One of my friends had a requirement to pull apart from JSON data, stripping out specific "columns" of data, ideally using standard Unix/Linux commands, again ideally in one single command.
{
"givenName": "Dave",
"familyName": "Hay",
"eddress": "david_hay@uk.ibm.com"
"givenName": "Dave",
"familyName": "Hay",
"eddress": "david_hay@uk.ibm.com"
"givenName": "Dave",
"familyName": "Hay",
"eddress": "david_hay@uk.ibm.com"
"givenName": "Dave",
"familyName": "Hay",
"eddress": "david_hay@uk.ibm.com"
"givenName": "Dave",
"familyName": "Hay",
"eddress": "david_hay@uk.ibm.com"
"givenName": "Dave",
"familyName": "Hay",
"eddress": "david_hay@uk.ibm.com"
"givenName": "Dave",
"familyName": "Hay",
"eddress": "david_hay@uk.ibm.com"
"givenName": "Dave",
"familyName": "Hay",
"eddress": "david_hay@uk.ibm.com"
"givenName": "Dave",
"familyName": "Hay",
"eddress": "david_hay@uk.ibm.com"
"givenName": "Dave",
"familyName": "Hay",
"eddress": "david_hay@uk.ibm.com"
"givenName": "Dave",
"familyName": "Hay",
"eddress": "david_hay@uk.ibm.com"
}
This is with what I came up :-)
So here's an example of the JSON data: -
davehay.json
{
"givenName": "Dave",
"familyName": "Hay",
"eddress": "david_hay@uk.ibm.com"
"givenName": "Dave",
"familyName": "Hay",
"eddress": "david_hay@uk.ibm.com"
"givenName": "Dave",
"familyName": "Hay",
"eddress": "david_hay@uk.ibm.com"
"givenName": "Dave",
"familyName": "Hay",
"eddress": "david_hay@uk.ibm.com"
"givenName": "Dave",
"familyName": "Hay",
"eddress": "david_hay@uk.ibm.com"
"givenName": "Dave",
"familyName": "Hay",
"eddress": "david_hay@uk.ibm.com"
"givenName": "Dave",
"familyName": "Hay",
"eddress": "david_hay@uk.ibm.com"
"givenName": "Dave",
"familyName": "Hay",
"eddress": "david_hay@uk.ibm.com"
"givenName": "Dave",
"familyName": "Hay",
"eddress": "david_hay@uk.ibm.com"
"givenName": "Dave",
"familyName": "Hay",
"eddress": "david_hay@uk.ibm.com"
"givenName": "Dave",
"familyName": "Hay",
"eddress": "david_hay@uk.ibm.com"
}
and this is the desired output: -
Hay
david_hay@uk.ibm.com
Hay
david_hay@uk.ibm.com
Hay
david_hay@uk.ibm.com
Hay
david_hay@uk.ibm.com
Hay
david_hay@uk.ibm.com
Hay
david_hay@uk.ibm.com
Hay
david_hay@uk.ibm.com
Hay
david_hay@uk.ibm.com
Hay
david_hay@uk.ibm.com
Hay
david_hay@uk.ibm.com
Hay
david_hay@uk.ibm.com
david_hay@uk.ibm.com
Hay
david_hay@uk.ibm.com
Hay
david_hay@uk.ibm.com
Hay
david_hay@uk.ibm.com
Hay
david_hay@uk.ibm.com
Hay
david_hay@uk.ibm.com
Hay
david_hay@uk.ibm.com
Hay
david_hay@uk.ibm.com
Hay
david_hay@uk.ibm.com
Hay
david_hay@uk.ibm.com
Hay
david_hay@uk.ibm.com
and this is the script that I developed: -
sed -n '/familyName/,/eddress/p' davehay.json | sed 's/,//' | sed 's/"//g' | awk '{printf "%s\n", $2}'
The way this works is: -
- The first sed command pulls out everything between the familyName and eddress columns
- The second sed command strips out the comma ( , ) terminator
- The third sed command strips out the double quotes ( " )
- The awk command prints the second column, thus just the data, rather than the column header
If I wanted this: -
Dave
Hay
david_hay@uk.ibm.com
Dave
Hay
david_hay@uk.ibm.com
Dave
Hay
david_hay@uk.ibm.com
Dave
Hay
david_hay@uk.ibm.com
Dave
Hay
david_hay@uk.ibm.com
Dave
Hay
david_hay@uk.ibm.com
Dave
Hay
david_hay@uk.ibm.com
Dave
Hay
david_hay@uk.ibm.com
Dave
Hay
david_hay@uk.ibm.com
Dave
Hay
david_hay@uk.ibm.com
Dave
Hay
david_hay@uk.ibm.com
Hay
david_hay@uk.ibm.com
Dave
Hay
david_hay@uk.ibm.com
Dave
Hay
david_hay@uk.ibm.com
Dave
Hay
david_hay@uk.ibm.com
Dave
Hay
david_hay@uk.ibm.com
Dave
Hay
david_hay@uk.ibm.com
Dave
Hay
david_hay@uk.ibm.com
Dave
Hay
david_hay@uk.ibm.com
Dave
Hay
david_hay@uk.ibm.com
Dave
Hay
david_hay@uk.ibm.com
Dave
Hay
david_hay@uk.ibm.com
then I'd adapt the command: -
sed -n '/givenName/,/eddress/p' davehay.json | sed 's/,//' | sed 's/"//g' | awk '{printf "%s\n", $2}'
Like most things, I bet there're a million ways to achieve the same objective :-)
But this'll do for me :-)