比如:下面JSON, 从列转行 和 行转列,如何快速实现? 尽量保证在O(0)到O(n)的时间复杂度。
[
{"field":"date","data":[20080102,20080103,20080104,20080105,20080106,20080107,20080108,20080109,20080110]},
{"field":"x1","data":[1.5916,2.5916,3.5916,3.5916,3.9916,2.5916,9.5916,3.5916,3.5916]},
{"field":"x2","data":[1.4916,7.5916,1.5916,2.5916,3.116,1.5916,7.5916,0.5916,3.5916]}
]
列转行
[
{"date":20080102,"x1":1.5916,"x2":1.4916},
{"date":20080103,"x1":1.5916,"x2":1.4916},
{"date":20080104,"x1":1.5916,"x2":1.4916},
{"date":20080105,"x1":1.5916,"x2":1.4916},
{"date":20080106,"x1":1.5916,"x2":1.4916}
]
评论就不单独回复了:
自己实现的算法 O(m+n),见文件中的 exports.col2row2 函数定义。 https://github.com/bsspirit/ape-algorithm/blob/master/rowcol.js
trans = (data) ->
results = []
for item in data
for value, index in item.data
results[index] ?= {}
results[index]["#{item.field}"] = value
results
O(m*n)
列转行转过来结果是不是应该是: [ {"date":20080102,"x1":1.5916,"x2":1.4916}, {"date":20080103,"x1":2.5916,"x2":7.4916}, {"date":20080104,"x1":3.5916,"x2":1.4916}, {"date":20080105,"x1":3.5916,"x2":3.4916}, {"date":20080106,"x1":2.5916,"x2":1.4916}. … ]
写了一个速度测试:idy/53eae4fff4b616a82f025533
其实从根本的角度考虑一下,你想做行列变换,几乎要把每个元素都移动一遍才可以,所以每个元素至少都要访问一遍,复杂度至少是 O(n*m)
的。就像你读书,不管你读的是电子书,还是纸质书,先读目录在读正文,或者从后面往前读,你总要每个字都看一遍才算读完的,复杂度最少就是字的个数。
用 eval
只是省去你写代码访问每行的元素,但是 eval
中包含了列数那么多行的赋值代码,如果你把 eval
里面的代码粘贴在 eval
那里,你再看看复杂度,还是 O(n*m)
么。
下面是测试结果:
Speed test
Scale at: 800 x 800
✓ col2row (392ms)
✓ col2row2 (454ms)
✓ col2row should equal to col2row2 (1ms)
Scale at: 400 x 1600
✓ col2row (315ms)
✓ col2row2 (337ms)
✓ col2row should equal to col2row2
Scale at: 200 x 3200
✓ col2row (412ms)
✓ col2row2 (936ms)
✓ col2row should equal to col2row2
Scale at: 100 x 6400
✓ col2row (387ms)
✓ col2row2 (264ms)
✓ col2row should equal to col2row2
Scale at: 50 x 12800
✓ col2row (331ms)
✓ col2row2 (97ms)
✓ col2row should equal to col2row2
Complexity test
Scale at: 100 x 100
✓ linear
✓ col2row2 (3ms)
✓ col2row (5ms)
Scale at: 200 x 200
✓ linear
✓ col2row2 (30ms)
✓ col2row (26ms)
Scale at: 400 x 400
✓ linear (1ms)
✓ col2row2 (96ms)
✓ col2row (87ms)
Scale at: 800 x 800
✓ linear
✓ col2row2 (552ms)
✓ col2row (722ms)
Scale at: 1600 x 1600
✓ linear
✓ col2row2 (2672ms)
✓ col2row (1854ms)
30 passing (10s)
@ravenwang 虽然不是O(n+m) 但是测试之后感觉 列数较小的时候 eval 会快很多,因为eval 执行的语句带缓存。 我的repo里解释了一下 ;) 下面这个case:
Scale at: 50 x 12800
✓ col2row (331ms)
✓ col2row2 (97ms)
✓ col2row should equal to col2row2