Hello, I have one big text file like this:
BIGFILE.TXT
COLUMN1,COLUMN2,COLUMN3,COLUMN4,COLUMN5,COLUMN6,COLUMN7,COLUMN8 # HEADER
11/24/2013,50.67,51.22,50.67,51.12,17,0,FILE1
11/25/2013,51.34,51.91,51.09,51.87,23,0,FILE1
12/30/2013,51.76,51.82,50.86,51.15,13,0,FILE1
12/31/2013,51.15,51.33,50.45,50.76,18,0,FILE1
1/1/2014,50.92,51.58,50.84,51.1,19,0,FILE2
1/4/2014,51.39,51.46,50.95,51.21,14,0,FILE2
1/7/2014,51.08,51.2,49.84,50.05,35,0,FILE2
1/8/2014,50.14,50.94,50.01,50.78,100,0,FILE3
1/11/2014,50.63,51.41,50.52,51.3,190,0,FILE3
1/15/2014,54.03,55.74,53.69,54.93,110,0,FILE4
1/19/2014,53.67,54.19,53.55,53.82,24,0,FILE4
1/20/2014,53.83,54.26,53.47,53.53,23,0,FILE4
1/21/2014,53.8,54.55,53.7,54.1,24,0,FILE4
1/26/2014,53.26,53.93,53.23,53.65,31,0,FILE5
1/27/2014,53.78,54,53.64,53.81,110,0,FILE5
I'm looking for a way to split this file into multiple text files.
In this case the one file would be split into 5 text files. The name of
each text file would be taken from column 8. The big file is comma
delimited. So the output would be:
FILE1.txt
COLUMN1,COLUMN2,COLUMN3,COLUMN4,COLUMN5,COLUMN6,COLUMN7,COLUMN8 # HEADER
11/24/2013,50.67,51.22,50.67,51.12,17,0,FILE1
11/25/2013,51.34,51.91,51.09,51.87,23,0,FILE1
12/30/2013,51.76,51.82,50.86,51.15,13,0,FILE1
12/31/2013,51.15,51.33,50.45,50.76,18,0,FILE1

FILE2.TXT
COLUMN1,COLUMN2,COLUMN3,COLUMN4,COLUMN5,COLUMN6,COLUMN7,COLUMN8 # HEADER
1/1/2014,50.92,51.58,50.84,51.1,19,0,FILE2
1/4/2014,51.39,51.46,50.95,51.21,14,0,FILE2
1/7/2014,51.08,51.2,49.84,50.05,35,0,FILE2

FILE3.TXT
COLUMN1,COLUMN2,COLUMN3,COLUMN4,COLUMN5,COLUMN6,COLUMN7,COLUMN8 # HEADER
1/8/2014,50.14,50.94,50.01,50.78,100,0,FILE3
1/11/2014,50.63,51.41,50.52,51.3,190,0,FILE3
.
.

The "big file" has more than 250 000 lines, and the output should have more than 1000 small files. So I'm looking for a way to generate more than 1000 small files automatically. The file name for all small files is always stored in column #8. One good soul already helped me with this code:

$src = "c:\temp\reallybig.csv" # Source file
$dst = "c:\temp\file{0}.csv" # Output file(s)
$reader = new-object IO.StreamReader($src) # Reader for input
while(($line = $reader.ReadLine()) -ne $null){ # Loop the input
    $match = [regex]::match($line, "(?i)file(\d)") # Look for row that ends with file-and-number
    if($match.Success){
        # Add the line to respective output file. SLOW!
        add-content $($dst -f $match.Groups[0].value) $line
    }
}
$reader.Close() # Close the input file
But 3 small (?) changes are needed for the code to work exactly how I need.
Change #1.
In fact, I used FILE1, FILE2 … FILEN only as an example (probably a bad
example). There can be ANY text string (A, ABC, E, EEEE, S, STAA …) in
column 8. The big file is actually sorted alphabetically by column 8.
Change #2.
Each small file should include the first line (header) of the big file.
Change #3.
The name of each output file should be "TEXT STRING IN COLUMN8" + ".TXT".
Thank you for any help.
John
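
For reference, here is a minimal sketch of how the three changes might look. It is not tested against the real file and rests on a few assumptions: every data row has at least 8 comma-separated columns with no quoted commas, the values in column 8 are valid Windows file names, and the first line of the big file is the header. The output folder variable $dstDir is a hypothetical name, not part of the original code.

```powershell
# Sketch only: adjust the paths to your environment.
$src    = "c:\temp\reallybig.csv"   # Source file (same path as in the code above)
$dstDir = "c:\temp"                 # Hypothetical output folder

$reader  = New-Object IO.StreamReader($src)
$writer  = $null    # Writer for the output file currently being filled
$current = $null    # Column 8 value of the file currently open

try {
    # Change #2: remember the first line so it can be copied into each small file.
    $header = $reader.ReadLine()

    while (($line = $reader.ReadLine()) -ne $null) {
        $cols = $line.Split(",")
        if ($cols.Count -lt 8) { continue }   # Skip blank or short rows (assumption)

        # Change #1: use whatever text is in column 8, not just FILE<number>.
        $key = $cols[7].Trim()

        if ($key -ne $current) {
            # Column 8 changed: close the previous output file, open the next one.
            if ($writer) { $writer.Close() }

            # Change #3: the output name is the column 8 text plus ".txt".
            $path  = Join-Path $dstDir ($key + ".txt")
            $isNew = -not (Test-Path $path)

            # Append mode, so an unsorted input still ends up in the right files.
            $writer = New-Object IO.StreamWriter($path, $true)
            if ($isNew) { $writer.WriteLine($header) }
            $current = $key
        }

        $writer.WriteLine($line)
    }
}
finally {
    if ($writer) { $writer.Close() }
    $reader.Close()
}
```

Because the big file is sorted by column 8, only one output file needs to be open at a time, which should be much faster than calling add-content for every line. Since the files are opened in append mode, old output files should be deleted before running the script again, otherwise the rows would be duplicated.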