[python] Python: CSV write by column rather than row

I have a python script that generates a bunch of data in a while loop. I need to write this data to a CSV file, so it writes by column rather than row.

For example in loop 1 of my script I generate:

(1, 2, 3, 4)

I need this to reflect in my csv script like so:

Result_1    1
Result_2    2
Result_3    3
Result_4    4

On my second loop i generate:

(5, 6, 7, 8)

I need this to look in my csv file like so:

Result_1    1    5
Result_2    2    6
Result_3    3    7
Result_4    4    8

and so forth until the while loop finishes. Can anybody help me?


EDIT

The while loop can last over 100,000 loops

This question is related to python csv

The answer is


wr.writerow(item)  #column by column
wr.writerows(item) #row by row

This is quite simple if your goal is just to write the output column by column.

If your item is a list:

yourList = []

with open('yourNewFileName.csv', 'w', ) as myfile:
    wr = csv.writer(myfile, quoting=csv.QUOTE_ALL)
    for word in yourList:
        wr.writerow([word])

Read it in by row and then transpose it in the command line. If you're using Unix, install csvtool and follow the directions in: https://unix.stackexchange.com/a/314482/186237


what about Result_* there also are generated in the loop (because i don't think it's possible to add to the csv file)

i will go like this ; generate all the data at one rotate the matrix write in the file:

A = []

A.append(range(1, 5))  # an Example of you first loop

A.append(range(5, 9))  # an Example of you second loop

data_to_write = zip(*A)

# then you can write now row by row

Updating lines in place in a file is not supported on most file system (a line in a file is just some data that ends with newline, the next line start just after that).

As I see it you have two options:

  1. Have your data generating loops be generators, this way they won't consume a lot of memory - you'll get data for each row "just in time"
  2. Use a database (sqlite?) and update the rows there. When you're done - export to CSV

Small example for the first method:

from itertools import islice, izip, count
print list(islice(izip(count(1), count(2), count(3)), 10))

This will print

[(1, 2, 3), (2, 3, 4), (3, 4, 5), (4, 5, 6), (5, 6, 7), (6, 7, 8), (7, 8, 9), (8, 9, 10), (9, 10, 11), (10, 11, 12)]

even though count generate an infinite sequence of numbers


Let's assume that (1) you don't have a large memory (2) you have row headings in a list (3) all the data values are floats; if they're all integers up to 32- or 64-bits worth, that's even better.

On a 32-bit Python, storing a float in a list takes 16 bytes for the float object and 4 bytes for a pointer in the list; total 20. Storing a float in an array.array('d') takes only 8 bytes. Increasingly spectacular savings are available if all your data are int (any negatives?) that will fit in 8, 4, 2 or 1 byte(s) -- especially on a recent Python where all ints are longs.

The following pseudocode assumes floats stored in array.array('d'). In case you don't really have a memory problem, you can still use this method; I've put in comments to indicate the changes needed if you want to use a list.

# Preliminary:
import array # list: delete
hlist = []
dlist = []
for each row: 
    hlist.append(some_heading_string)
    dlist.append(array.array('d')) # list: dlist.append([])
# generate data
col_index = -1
for each column:
    col_index += 1
    for row_index in xrange(len(hlist)):
        v = calculated_data_value(row_index, colindex)
        dlist[row_index].append(v)
# write to csv file
for row_index in xrange(len(hlist)):
    row = [hlist[row_index]]
    row.extend(dlist[row_index])
    csv_writer.writerow(row)

As an alternate streaming approach:

  • dump each col into a file
  • use python or unix paste command to rejoin on tab, csv, whatever.

Both steps should handle steaming just fine.

Pitfalls:

  • if you have 1000s of columns, you might run into the unix file handle limit!