how to sort on comma delimited fields enclosed in double quotes

Forum rules
Before you post please read how to get help
Post Reply
coffeeking
Level 1
Level 1
Posts: 20
Joined: Mon Dec 23, 2019 8:54 pm

how to sort on comma delimited fields enclosed in double quotes

Post by coffeeking »

Can someone tell me how to sort this file on fields 5, then 4, then 12, then 14? Thank you.

"OH0010000000","14","7400076","LAST","FIRST","MIDDLE","","1954-04-13","1974-0
4-08","ACTIVE","R","123 ANYSTREET","","BLANCHESTER","OH","45107","","","","","",
"","","","","","","GREAT OAKS CAREER CAMPUSES","","","","15","12","SOUTHERN OHIO
ESC","","","BLANCHESTER LOCAL SD (CLINTON)","","MARION","14ABE","10","91","17",
"MARION TOWNSHIP","","","R","X","R","X","R","X","R","X","","X","","","","X","","
R","X","","","X","","","R","","X","","","","","","X","R","","","X","","","X","R"
,"X","","","","X","R","X","R","","X","R","","","X","X","","X","R","","X","","","
X","R"
"OH0010000004","14","10100929","LAST","FIRST","MIDDLE","SR","1961-10-07","2001-0
9-20","ACTIVE","R","456 ANYSTREET","","WILMINGTON","OH","45177","","","","","",
"","","","","","","GREAT OAKS CAREER CAMPUSES","","WILMINGTON CITY SD","","15","
12","","","","","","GREEN","14ABY","10","91","17","GREEN TOWNSHIP","","","","","
","","","","","","","X","","","","","","","X","X","","X","","","","","X","","","
","","","","","","","","","","","","X","","","","","","","","","","R","","","X",
"","","","","","","","","",""
"OH0010000006","14","10001812","LAST","FIRST","","","1957-09-20","2000-10-03
","ACTIVE","R","789 ANYSTREET,"","MIDLAND","OH","45148","","","","","","","",""
,"","","","GREAT OAKS CAREER CAMPUSES","","","","15","12","SOUTHERN OHIO ESC",""
,"","BLANCHESTER LOCAL SD (CLINTON)","","MARION","14ABE","10","91","17","MARION
TOWNSHIP","","","","X","","","","","","","","X","","","","","","","X","","","","
","","","","X","","","","","","X","","","","","","","X","","X","","","","","",""
,"","","X","R","","","X","","","X","","","X","","","X",""
Last edited by MrEen on Thu Jul 30, 2020 7:13 pm, edited 1 time in total.
Reason: Removed personal information from appropriate fields

freshminted
Level 1
Level 1
Posts: 37
Joined: Fri May 01, 2020 12:26 am

Re: how to sort on comma delimited fields enclosed in double quotes

Post by freshminted »

Save it all as a CSV (Comma Separated Variable) text file with your favourite text editor (Xed etc). You'll have to join up all those lines, it looks like there should really only be three lines of csv data. The load it into Libreoffice Calc (or M$ Excel) and go from there.

User avatar
deck_luck
Level 5
Level 5
Posts: 825
Joined: Mon May 27, 2019 6:57 pm
Location: R-4808 North

Re: how to sort on comma delimited fields enclosed in double quotes

Post by deck_luck »

coffeeking wrote:
Thu Jul 30, 2020 6:38 pm
Can someone tell me how to sort this file on fields 5, then 4, then 12, then 14? Thank you.
...
Please be careful, when posting information about other people. It could be invading their privacy, and I would encourage you change the data before posting it.

There are many different was to do this. Assuming the three examples are three unique lines ending with a new line character, you can use the "sort" command to perform the tasks, and you can learn about it reading the man sort. The -t delimiter and the -k fields will be helpful.
🐧Linux Mint 19 XFCE 💡Give a friend a fish, and you feed them for a day. Teach a friend how to fish, and you feed them for a lifetime. ✝️ Proverbs 4:7 Wisdom is the principal thing; therefore get wisdom: and with all thy getting get understanding.

User avatar
MrEen
Level 20
Level 20
Posts: 11542
Joined: Mon Jun 12, 2017 8:39 pm

Re: how to sort on comma delimited fields enclosed in double quotes

Post by MrEen »

deck_luck wrote:
Thu Jul 30, 2020 7:02 pm
Please be careful, when posting information about other people. It could be invading their privacy, and I would encourage you change the data before posting it.
Removed personally identifiable information. Thanks for mentioning that. :D

coffeeking
Level 1
Level 1
Posts: 20
Joined: Mon Dec 23, 2019 8:54 pm

Re: how to sort on comma delimited fields enclosed in double quotes

Post by coffeeking »

Sorry, should have thought about posting personally identifiable data. Won't do that again!

I have been using the 'sort' command with the '-t' and '-k' parameters but keep getting the error message "multi-character tab '-k5,1'"

I'm really a little unclear on how to use the '-k' parameter ---- i.e., on exactly what the syntax of this parameter is.

coffeeking
Level 1
Level 1
Posts: 20
Joined: Mon Dec 23, 2019 8:54 pm

Re: how to sort on comma delimited fields enclosed in double quotes

Post by coffeeking »

Somewhat in my defense, all of this data is publicly available, nothing confidential. However, in future, I'll mask all personally identifiable info.

Here is an example of a command I've been trying to run

sort -t -k5,1 -k4,1 -k12,1 -k14,1 filetosort.txt > filesorted.txt

And, as I say, I am consistently getting the message "sort: multi-character tab '-k5,1'"

I have no idea what "multi-character tab" refers to (it is a straight csv file, with comma delimiters --- no tab character anywhere in the file as far as I know)

The double quote marks around each field may be the tricky part. Is there something I need in the sort command to take these double quote marks into account?

rene
Level 16
Level 16
Posts: 6519
Joined: Sun Mar 27, 2016 6:58 pm

Re: how to sort on comma delimited fields enclosed in double quotes

Post by rene »

This should do it it seems:

Code: Select all

rene@t5500:~$ cat data
x,x,x,2,2,x,x,x,x,x,x,2,x,2
x,x,x,2,2,x,x,x,x,x,x,2,x,1
x,x,x,2,2,x,x,x,x,x,x,1,x,2
x,x,x,2,2,x,x,x,x,x,x,1,x,1
x,x,x,1,2,x,x,x,x,x,x,2,x,2
x,x,x,1,2,x,x,x,x,x,x,2,x,1
x,x,x,1,2,x,x,x,x,x,x,1,x,2
x,x,x,1,2,x,x,x,x,x,x,1,x,1
x,x,x,2,1,x,x,x,x,x,x,2,x,2
x,x,x,2,1,x,x,x,x,x,x,2,x,1
x,x,x,2,1,x,x,x,x,x,x,1,x,2
x,x,x,2,1,x,x,x,x,x,x,1,x,1
x,x,x,1,1,x,x,x,x,x,x,2,x,2
x,x,x,1,1,x,x,x,x,x,x,2,x,1
x,x,x,1,1,x,x,x,x,x,x,1,x,2
x,x,x,1,1,x,x,x,x,x,x,1,x,1
rene@t5500:~$ sort -t, -k5,5 -k4,4 -k12,12 -k14,14 data
x,x,x,1,1,x,x,x,x,x,x,1,x,1
x,x,x,1,1,x,x,x,x,x,x,1,x,2
x,x,x,1,1,x,x,x,x,x,x,2,x,1
x,x,x,1,1,x,x,x,x,x,x,2,x,2
x,x,x,2,1,x,x,x,x,x,x,1,x,1
x,x,x,2,1,x,x,x,x,x,x,1,x,2
x,x,x,2,1,x,x,x,x,x,x,2,x,1
x,x,x,2,1,x,x,x,x,x,x,2,x,2
x,x,x,1,2,x,x,x,x,x,x,1,x,1
x,x,x,1,2,x,x,x,x,x,x,1,x,2
x,x,x,1,2,x,x,x,x,x,x,2,x,1
x,x,x,1,2,x,x,x,x,x,x,2,x,2
x,x,x,2,2,x,x,x,x,x,x,1,x,1
x,x,x,2,2,x,x,x,x,x,x,1,x,2
x,x,x,2,2,x,x,x,x,x,x,2,x,1
x,x,x,2,2,x,x,x,x,x,x,2,x,2
Note that quotes are not an issue if they are around every field.

coffeeking
Level 1
Level 1
Posts: 20
Joined: Mon Dec 23, 2019 8:54 pm

Re: how to sort on comma delimited fields enclosed in double quotes

Post by coffeeking »

Hmm, yes I see that that is working.

I'm wondering now if I may not have another "end of line" problem --- i.e. are all of my lines ending with the required line feed?

This is another file I acquired from a non-unix world, so it probably doesn't have line feeds.

When I execute "file filename" it says "ASCII text with very long lines" but doesn't tell me what character is at the end of the lines (whereas with a previous file it told me the lines ended in carriage returns)

There must be end of line characters of some kind --- since "more filename" displays the records as you would expect ---- ending where they end.

How do I find out what control character is at the end of the lines?

rene
Level 16
Level 16
Posts: 6519
Joined: Sun Mar 27, 2016 6:58 pm

Re: how to sort on comma delimited fields enclosed in double quotes

Post by rene »

Easiest is to simply look at the file in a(ny) hex-editor/viewer. F.e. sudo apt-get install xxd, xxd -g 1 the_file | less and look at the spot where you feel/know a newline should be.

Note; while I expect it was only a copy/paste thing, your originally pasted in file has lines that are cut short.

rene
Level 16
Level 16
Posts: 6519
Joined: Sun Mar 27, 2016 6:58 pm

Re: how to sort on comma delimited fields enclosed in double quotes

Post by rene »

Oh, and by the way, didn't feel it useful to additionally remark upon after showing an actual command line, but since you didn't ack that part: I hope you are not still/again missing that , after the -t? I.e., that which tells sort that the records are comma-seperated?

coffeeking
Level 1
Level 1
Posts: 20
Joined: Mon Dec 23, 2019 8:54 pm

Re: how to sort on comma delimited fields enclosed in double quotes

Post by coffeeking »

Thanks. This hex viewer is showing hex '0a' at the end of the records, so evidently the required line feed is there.

But now I'm stumped again. If the problem I'm having sorting this file is not a end of line problem, then I don't know what it is.

This file is very big, and the records are very big. Could I be encountering a size limit of some kind for the sort command?

I am able to sort the file, however. It is only when I attempt to sort using keys (the -k parameter) that I get the error message "sort: multi-character tab '-k5,5'"

I have no idea what this message means.

User avatar
Kadaitcha Man
Level 9
Level 9
Posts: 2535
Joined: Mon Aug 27, 2012 10:17 pm

Re: how to sort on comma delimited fields enclosed in double quotes

Post by Kadaitcha Man »

coffeeking wrote:
Fri Jul 31, 2020 8:53 am
Could I be encountering a size limit of some kind for the sort command?
You were given accurate details about that only a few days ago.
It's pronounced kad-eye-cha, not kada-itcha.

rene
Level 16
Level 16
Posts: 6519
Joined: Sun Mar 27, 2016 6:58 pm

Re: how to sort on comma delimited fields enclosed in double quotes

Post by rene »

coffeeking wrote:
Fri Jul 31, 2020 8:53 am
I have no idea what this message means.
Yes you do. Look one message above your own.

coffeeking
Level 1
Level 1
Posts: 20
Joined: Mon Dec 23, 2019 8:54 pm

Re: how to sort on comma delimited fields enclosed in double quotes

Post by coffeeking »

Oh my gosh! So sorry! I was doing that correctly initially but in my haste and confusion subsequently forgot it!! What a rookie error!

Sort seems to be working now!

Thank you again for your help. I appreciate so much the help of the experts and "old hands" on this forum!!

P.S. Yes, you're right about the size question. But every time I'm having problems getting the sort to work, I keep thinking that the size of the file (millions of records) or the size of the record (several hundred bytes) must be the problem. I am amazed that linux is able to manage these very big numbers with such ease!

User avatar
deck_luck
Level 5
Level 5
Posts: 825
Joined: Mon May 27, 2019 6:57 pm
Location: R-4808 North

Re: how to sort on comma delimited fields enclosed in double quotes

Post by deck_luck »

Been there. I call it target fixation when you keep staring at the code and focusing on what you believe is the problem and ignoring the rest. Countless times I will walk away to get a coffee or just do something else for an hour, and when I come back to the erroneous code it slaps me in the forehead. Why didn't I see that before. :lol:


Kadaitcha Man is the master of creepy disguises.
🐧Linux Mint 19 XFCE 💡Give a friend a fish, and you feed them for a day. Teach a friend how to fish, and you feed them for a lifetime. ✝️ Proverbs 4:7 Wisdom is the principal thing; therefore get wisdom: and with all thy getting get understanding.

Post Reply

Return to “Scripts & Bash”