Convert ISO Date to DD-MON-YYYY in a csv file in unix - awk

I have a CSV file containing dates in ISO format, like below.
id,x1,x2,x3
AIR,Partner,2015-10-20T04:00:00.000Z,2015-10-20T04:00:00.000Z,2016-02-12T05:00:00.000Z
CMX,Partner,Tier,2017-03-23T04:00:00.000Z
WKA,Partner,Tier,2017-05-22T04:00:00.000Z
APP,Partner,2017-10-04T04:00:00.000Z,Tier
BUN,2017-09-27T04:00:00.000Z,Partner,,2017-09-27T04:00:00.000Z
There is no fixed column for the date; it can appear in any column except the first.
I want to convert every occurrence of an ISO date into DD-MON-YYYY or DD/MM/YYYY format.
Please help.

You can use the following through a UNIX pipe:
sed -E 's#\b([0-9]{4})-([0-9]{2})-([0-9]{2})T[0-9]{2}:[0-9]{2}:[0-9]{2}\.[0-9]{3}Z\b#\3/\2/\1#g'
Another option would be the following PCRE, which explicitly requires the , separator at the start (as you mention this field cannot be the first one), and either , or EOL at the end of the matched expression:
perl -pe 's#,\K([0-9]{4})-([0-9]{2})-([0-9]{2})T[0-9]{2}:[0-9]{2}:[0-9]{2}\.[0-9]{3}Z(?=,|$)#$3/$2/$1#g'

You can combine the previous answer with How to Print Spelled out month names rather than numbers to get the following two options, depending on which of the two answers you like best:
The first option is to not use any libraries, but just use the direct embedding of the names of the months:
cat 48941818.txt | perl -pe '@m = qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec); s#,\K([0-9]{4})-([01][0-9])-([0-3][0-9])T[0-9]{2}:[0-9]{2}:[0-9]{2}\.[0-9]{3}Z(?=,|$)#$3-$m[$2-1]-$1#g'
The second option is to use strftime from POSIX (http://perldoc.perl.org/POSIX.html#strftime), as in C, and to treat the replacement part as Perl code with the e modifier, as per Embedding evaluations in Perl regex and, ultimately, as documented at http://perldoc.perl.org/perlrequick.html#Search-and-replace. This makes it extra flexible, as you can make further adjustments to the date format (although on my system %v didn't work, and %e left an extra space in one-digit days).
% cat 48941818.txt | perl -pe 'use POSIX qw(strftime); s#,\K([0-9]{4})-([01][0-9])-([0-3][0-9])T([0-9]{2}):[0-9]{2}:[0-9]{2}\.[0-9]{3}Z(?=,|$)#POSIX::strftime(qw/%e-%b-%Y/,0,0,$4,$3,$2-1,$1-1900)#ge'
id,x1,x2,x3
AIR,Partner,20-Oct-2015,20-Oct-2015,12-Feb-2016
CMX,Partner,Tier,23-Mar-2017
WKA,Partner,Tier,22-May-2017
APP,Partner, 4-Oct-2017,Tier
BUN,27-Sep-2017,Partner,,27-Sep-2017
%
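For comparison, the same conversion can be sketched in Python. This is only an illustration, not part of the answers above: the names ISO_FIELD, MONTHS, convert_field and convert_line are hypothetical, and it assumes a simple CSV with no quoted fields containing commas. Unlike %e, it zero-pads one-digit days.

```python
import re
from datetime import datetime

# Hypothetical helper names; assumes plain comma-separated lines without quoting.
ISO_FIELD = re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}Z")
# A fixed month list keeps the output independent of the system locale.
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def convert_field(field):
    """Return DD-MON-YYYY if the whole field is an ISO timestamp, else the field unchanged."""
    if not ISO_FIELD.fullmatch(field):
        return field
    parsed = datetime.strptime(field, "%Y-%m-%dT%H:%M:%S.%fZ")
    return f"{parsed.day:02d}-{MONTHS[parsed.month - 1]}-{parsed.year}"


def convert_line(line):
    """Convert every field except the first, which never holds a date."""
    first, *rest = line.split(",")
    return ",".join([first] + [convert_field(f) for f in rest])


if __name__ == "__main__":
    sample = [
        "id,x1,x2,x3",
        "APP,Partner,2017-10-04T04:00:00.000Z,Tier",
        "BUN,2017-09-27T04:00:00.000Z,Partner,,2017-09-27T04:00:00.000Z",
    ]
    for row in sample:
        print(convert_line(row))
```

For real files with quoted fields, the standard csv module would be a safer way to split the lines.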

Related

jqwidgets-datetimeinput date format d MMMM yyyy not displayed correctly

I am using jqwidgets-datetimeinput to display and input formatted dates. It works fine for most use cases, but for the format d MMMM yyyy it does not behave correctly.
It shows 6/11/2020 June 2020 instead of 11 June 2020.
Interesting. I think you just found one of the many bugs/"features" that plague jqWidgets.
Upon further investigation, this seems to happen when d is the end of the format string, or is followed by a space character. To get around this, you can use a non-breaking space:
'd\u00a0MMMM yyyy'
If you're on Windows, you can also do Alt+0160 to type it out, but this may be unwise, as the two characters look identical.

How to remove unix timestamp specific data from a flatfile

I have a huge file containing a list like this
email@domain.com^B1569521698
email2@domain.com,@2domain.com^B1569521798
email3@domain.com,test@2domain.com^B1569521898
email10000@domain.com^B1569521998
..
..
The file is named /usr/local/email/whitelist.
The number after ^B is a Unix timestamp.
I need to remove from the list all rows with a timestamp smaller than, e.g., 1569521898.
I tried various awk/sed combinations with no result.
The ^B character you see is a control character. ASCII codes 0 through 1FH (the first 32 codes) form a special set of non-printing characters. They are called control characters because they perform printer and display control operations rather than displaying symbols. This particular one is STX, or Start of Text.
You can type control characters in a shell as Ctrl+v Ctrl+b, or you can use the octal representation directly (\002).
awk -F '\002' '($2 >= 1569521898)' /usr/local/email/whitelist
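Since your Input_file contains control characters, you could try the following. It was written and tested with the given samples only.
awk '
# Find the STX byte (octal 002) followed by the timestamp digits.
match($0, /\002[0-9]+/) {
  # Skip the STX byte itself; adding 0 forces a numeric comparison.
  val = substr($0, RSTART+1, RLENGTH-1) + 0
  if (val >= 1569521898) { print }
}
' Input_file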
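The same filter can be sketched in Python, which may be easier to extend. This is a minimal illustration, not taken from the answers above: keep_recent and the example.com addresses are hypothetical, and it assumes each line ends with the STX byte (\x02) followed by a decimal timestamp. Lines without that suffix are dropped, as with the awk commands.

```python
def keep_recent(lines, cutoff):
    """Yield only the lines whose timestamp after the STX byte is >= cutoff."""
    for line in lines:
        # rpartition splits on the last STX byte; sep is empty if none was found.
        _, sep, stamp = line.rstrip("\n").rpartition("\x02")
        if sep and stamp.isdigit() and int(stamp) >= cutoff:
            yield line


if __name__ == "__main__":
    # Hypothetical sample data in the same layout as the whitelist file.
    sample = [
        "a@example.com\x021569521698\n",
        "b@example.com,c@example.com\x021569521898\n",
        "d@example.com\x021569521998\n",
    ]
    for kept in keep_recent(sample, 1569521898):
        print(kept, end="")
```

To run it against the real file, the sample list could be replaced by an open file handle, since iterating over a file yields its lines.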

What are the two numbers in hapi's server.log or request.log?

When I use server.log or request.log I see two numbers appear at the beginning of each line, what exactly are these two numbers?
server.log(['info'], 'hello world');
Output:
151230/205853.557, [log,info], data: hello world
I'm assuming this is a process id and a timestamp? "151230/205853.557" How do I interpret these numbers?
This is the default Date/Time output format of the GoodConsole reporter for Good.
It can be changed by setting the format option:
From the docs:
format - MomentJS format string. Defaults to 'YYMMDD/HHmmss.SSS'.
So 151230/205853.557 means 30th December 2015 at 20:58:53 and 557ms.
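To check the reading of such a timestamp mechanically, here is a small Python sketch. It is only an illustration: parse_good_timestamp is a hypothetical name, and the strptime pattern is my translation of the MomentJS format 'YYMMDD/HHmmss.SSS' quoted above. The result is a naive datetime, since the log prefix carries no time zone.

```python
from datetime import datetime


def parse_good_timestamp(stamp):
    """Parse a 'YYMMDD/HHmmss.SSS' log prefix into a naive datetime."""
    # %y = two-digit year, %f = fractional seconds (here three digits of milliseconds).
    return datetime.strptime(stamp, "%y%m%d/%H%M%S.%f")


if __name__ == "__main__":
    parsed = parse_good_timestamp("151230/205853.557")
    print(parsed.isoformat(timespec="milliseconds"))  # 2015-12-30T20:58:53.557
```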

Grep for Multiple instances of string between a substring and a character?

Can you please tell me how to grep for every instance of a substring that occurs multiple times on multiple lines within a file?
I've looked at
https://unix.stackexchange.com/questions/131399/extract-value-between-two-search-patterns-on-same-line
and How to use sed/grep to extract text between two words?
But my problem is slightly different: each substring will be immediately preceded by the string name"> and will be terminated by a < character immediately after the last character of the substring I want.
So one line might be
<"name">Bob<125><adje></name><"name">Dave<123><adfe></name><"name">Fred<125><adfe></name>
And I would like the output to be:
Bob
Dave
Fred
Although awk is not the best tool for XML processing, it will do if your XML structure and data are simple enough.
$ awk -F"[<>]" '{for(i=1;i<NF;i++) if($i=="\"name\"") print $(++i)}' file
Bob
Dave
Fred
I doubt that the tag is <"name"> though. If it's <name> without the quotes, change the condition in the script to $i=="name".
With gawk, which accepts a regular expression as the record separator:
awk -vRS='<"name">|<' '/^[A-Z]/' file
Bob
Dave
Fred
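The extraction rule from the question (text after name"> and up to the next <) can also be written as a single regular expression. The Python sketch below is an illustration only; NAME_PATTERN and extract_names are hypothetical names, and like the awk versions it assumes the markup is as regular as in the sample line.

```python
import re

# Capture everything between the literal name"> and the next < character.
NAME_PATTERN = re.compile(r'name">([^<]*)<')


def extract_names(text):
    """Return every substring that follows name\"> up to the next <."""
    return NAME_PATTERN.findall(text)


if __name__ == "__main__":
    line = ('<"name">Bob<125><adje></name>'
            '<"name">Dave<123><adfe></name>'
            '<"name">Fred<125><adfe></name>')
    for name in extract_names(line):
        print(name)
```

The closing </name> tags are not matched because they lack the quote before the >.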

Unix cut in c shell

in file ~/x,
--- //zep/arod/jo/new/ded/main/changes 2013-05-13 17:14:34.000000000 -0700
--- //zep/arod/jo/new/ded/main/lib/soph/tool.py 2013-05-16 14:14:34.000000000 -0700
--- //zep/arod/jo/new/ded/main/lib/soph/pomp.py 2013-05-16 14:14:34.000000000 -0700
in c shell,
set F=`grep '^---' ~/x | cut -d/ -f7-99 | cut and somehow cut number`
then, ls $F should give
ded/main/changes
ded/main/lib/soph/tool.py
ded/main/lib/soph/pomp.py
I don't quite understand the -f flag and am not sure how to cut off the timestamp part.
Any suggestions?
-f7-99 means "include fields 7 through 99" (in this case they probably just meant -f7-, which gives field 7 and everything after it).
cut divides each line up into fields, based on the divider (which is what -d/ is specifying - the divider in that case is the / character). It then returns the fields that you ask it for (in your example, 7 through 99).
Your second cut command could probably be cut -d' ' -f1 which would use a divider of spaces and only give you the first field (in other words, everything before the first space, which would be just the path).
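To make the field numbering concrete, here is a Python sketch that mimics the two cut steps on one of the sample lines. It is only an illustration: path_from_diff_header is a hypothetical name, and it does not reproduce every cut behaviour (for example, cut prints lines without the delimiter unchanged).

```python
def path_from_diff_header(line):
    """Mimic: cut -d/ -f7- | cut -d' ' -f1"""
    # Splitting on "/" gives 1-based cut fields at 0-based indexes,
    # so field 7 and up is the slice starting at index 6.
    from_field_seven = "/".join(line.split("/")[6:])
    # The first space-separated field is the path without the timestamp.
    return from_field_seven.split(" ")[0]


if __name__ == "__main__":
    header = "--- //zep/arod/jo/new/ded/main/lib/soph/tool.py 2013-05-16 14:14:34.000000000 -0700"
    print(path_from_diff_header(header))  # ded/main/lib/soph/tool.py
```

Field 1 is the "--- " before the first slash and field 2 is the empty string between the two slashes, which is why ded ends up as field 7.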