awk replace serialized number lines and move up other lines

I have a file in the following format:
1 - descrio #944
name
address
2 - desanother #916
name
address
3 - somedes #957
name
address
and I want to get the output as:
Usercode #944, name, address
Usercode #916, name, address
Usercode #957, name, address

With awk
awk 'NR%3 == 1{sub(/^.*#/, "Usercode #")};{ORS=NR%3?", ":"\n"};1' file
Usercode #944, name, address
Usercode #916, name, address
Usercode #957, name, address
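Spelled out with comments, the same one-liner looks like this (same logic, just formatted for readability):
awk '
NR % 3 == 1 {                     # the numbered lines: 1, 4, 7, ...
  sub(/^.*#/, "Usercode #")       # replace everything up to the # with "Usercode #"
}
{
  ORS = (NR % 3 ? ", " : "\n")    # join the first two lines of each record with ", ", end the record with a newline
}
1                                 # print every line
' file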
For a variable number of rows per record, with GNU awk (RT is gawk-specific):
awk -v RS='(^|\n)[[:digit:]]+[[:blank:]]*-[[:blank:]]*' '{sub(/\n$/, "");
gsub(/\n/, ", "); printf "%s", $0""RT}END{print ""}' file

If you do not have # in any of your descriptions, try:
sed -e 's/.*#/Usercode #/;N;N;s/\n/, /g' input

You may also try this command (here ccc is the input file):
$ paste -d'~' - - - < ccc | sed 's/^[^#]*/Usercode /g;s/~/, /g'
Usercode #944, name, address
Usercode #916, name, address
Usercode #957, name, address


Is there a query using SQLite to count valid unique email addresses from 3 separate email address fields To, CC, BCC?

I have the following query I'm working with, which returns 1 per row, but I know there is more than one email address stored within the field, separated by semicolons:
SELECT UID, EmailToField,
EmailToField REGEXP '[a-zA-Z0-9+._-]+#[a-zA-Z0-9._-]+\.[a-zA-Z0-9_-]+' AS valid_emailTo
FROM table
For example, my DB has:
UID | EmailTo                                    | EmailCC              | EmailBCC
001 | emailTo_1#domain.com; emailTo_2#domain.com | emailCC_1#domain.com | EmailBcc1#domain.com
Expected result:
UID | validEmailToCcBcc_count
001 | 4
I used AWK instead of SQL to obtain the results; the following worked!
awk '{print NR " " gsub(/[a-zA-Z0-9+._-]+#[a-zA-Z0-9._-]+\.[a-zA-Z0-9_-]+/, "")}' test.csv > results.csv
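If you also want a grand total across the whole file in addition to the per-row counts, a minimal variant of the same idea might look like this; it keeps the matched text instead of deleting it, since gsub() returns the number of replacements either way:
awk '{
  n = gsub(/[a-zA-Z0-9+._-]+#[a-zA-Z0-9._-]+\.[a-zA-Z0-9_-]+/, "&")  # count the addresses on this line without changing it
  total += n
  print NR, n                                                        # per-row count, as before
}
END { print "total", total }' test.csv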

Struggling with an awk script and need help to get this done; just need your suggestion or logic

I have an SQL file and need to filter the data:
-- Edit this file by adding your SQL below each question.
-------------------------------------------------------------------------------
-------------------------------------------------------------
-- The following queries are based on the 1994 census data.
-------------------------------------------------------------
.read 1994-census-summary-1.sql
-- 4. what is the average age of people from China?
select avg(age)
from census
where native_country ='China';
-- 5. what is the average age of people from Taiwan?
select avg(age)
from census
where native_country ='Taiwan';
-- 6. which native countries have "land" in their name?
select distinct(native_country)
from census
where native_country like '%land%';
--------------------------------------------------------------------------------------
-- The following queries are based on the courses-ddl.sql and courses-small.sql data
--------------------------------------------------------------------------------------
drop table census;
.read courses-ddl.sql
.read courses-small-1.sql
-- 11. what are the names of all students who have taken some course? Don't show duplicates.
select distinct(name)
from student
where tot_cred > 0;
-- 12. what are the names of departments that offer 4-credit courses? Don't list duplicates.
select distinct(dept_name)
from course
where credits=4;
-- 13. What are the names and IDs of all students who have received an A in a computer science class?
select distinct(name), id
from student natural join takes natural join course
where dept_name="Comp. Sci." and grade="A";
If I run
./script.awk -v ID=6 file.sql
Note that the problem id is passed to the awk script as variable ID on the command line, like this:
-v ID=6
how can I get the result like this?
Result:
select distinct(native_country) from census where native_country like '%land%';
With your shown samples, and in GNU awk, please try the following code using its match function. Here id is an awk variable holding the value you want to look up in the lines of your Input_file. I have also used exit to print only the very first match and leave the program early, to save some time/cycles; in case you have more than one match, simply remove it from the following code.
awk -v RS= -v id="6" '
match($0,/(\n|^)-- ([0-9]+)\.[^\n]*\n(select[^;]*;)/,arr) && arr[2]==id{
gsub(/\n/,"",arr[3])
print arr[3]
exit
}
' Input_file
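If you want to call it exactly as in the question (./script.awk -v ID=6 file.sql, after chmod +x script.awk), a minimal sketch of a standalone gawk script with the same logic could look like this (assuming gawk is installed at /usr/bin/gawk; note it uses the ID variable from the command line instead of id):
#!/usr/bin/gawk -f
BEGIN { RS = "" }                 # paragraph mode, same as -v RS=
match($0, /(\n|^)-- ([0-9]+)\.[^\n]*\n(select[^;]*;)/, arr) && arr[2] == ID {
  gsub(/\n/, " ", arr[3])         # join the multi-line query into one line
  print arr[3]
  exit                            # stop after the first match
}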
One option with awk could be to match the start of the line against -- 6., where 6 is the ID.
Then move on to the next line and set a variable (seen) to mark that the part you want to collect has started.
Then collect all lines while seen is set, resetting seen to 0 when an "empty" line is encountered.
Concatenate the lines that you want in the output into a single line, and at the end remove the trailing space.
gawk -v ID=6 '
match($0, "^-- "ID"\\.") {
seen=1
next
}
/^[[:space:]]*$/ {
seen=0
}
seen {
a = a $0 " "
}
END {
sub(/ $/, "", a)
print a
}
' file.sql
Or as a single line
gawk -v ID=6 'match($0,"^-- "ID"\\."){seen=1;next};/^[[:space:]]*$/{seen=0};seen{a=a$0" "};END{sub(/ $/,"",a);print a}' file.sql
Output
select distinct(native_country) from census where native_country like '%land%';
Another option with GNU awk is to set the record separator to an "empty" line and use a regex with a capture group to match all lines after the initial -- ID match that do not start with a space:
gawk -v ID=6 '
match($0, "\\n-- "ID"\\.[^\\n]*\\n(([^[:space:]][^\\n]*(\\n|$))*)", m) {
gsub(/\n/, " ", m[1])
print m[1]
}
' RS='^[[:space:]]*$' file

AWK multiline match

I have read several topics about my problem but nothing has resolved it!
I have this text in the file my_text:
Address dlkjfhadvkahvealkjvhfelkafver
Phone 4752935729527297
Discription fkdshkhglkhrtlghltkg
Misc 5897696h8ghgvjhgh578hg
Address klsfghtrgjgjktsrljgsjgm
Phone 5789058309809583
Discription dskjfvhfhgjvnwmrew
Misc h09v3n3vt7957jt795783hj
.....
.....
.....
And I want to filter this file's data by 3 (or more) line values such as Address, Phone, Misc.
I tried awk '/Address/,/Phone/,/Misc/' my_text but got an error!
You need to use the OR (|) operator.
To print the matched lines:
awk '/Address|Phone|Misc/{ print $0 }' your_text
Result:
Address dlkjfhadvkahvealkjvhfelkafver
Phone 4752935729527297
Misc 5897696h8ghgvjhgh578hg
Address klsfghtrgjgjktsrljgsjgm
Phone 5789058309809583
Misc h09v3n3vt7957jt795783hj
If you print $1 you get Address, Phone, or whatever you matched, and $2 will print only your values.
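For instance, based on the sample data above (where each value is a single whitespace-separated field), printing only the values might look like this:
awk '/Address|Phone|Misc/{ print $2 }' my_text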

awk move line up if not matching pattern

I am new to awk and sed. I have the following lines and want to move a line up if it does not match a pattern.
File:
company name
address line
city, state, zip
extra info
company name
address line
city, state, zip
extra info
company name
address line
city, state, zip
extra info
... and it goes on like that
I want to use pattern matching on 'company name': if a line does not contain 'company name', move the line up.
Desired output:
company name, address line, city, state, zip, extra info
company name, address line, city, state, zip, extra info
company name, address line, city, state, zip, extra info
... and continue on
Thanks for any help
Here is how to do it with awk
awk '{printf "%s"(NR%4?", ":RS),$0}' file
company name, address line, city, state, zip, extra info
company name, address line, city, state, zip, extra info
company name, address line, city, state, zip, extra info
For every 4th line, use RS as the separator; otherwise use ", ".
Or as Jaypal suggested:
awk '{ORS=(NR%4?", ":RS)}1' file
$ awk '{printf "%s%s", (/company name/?rs:", "), $0; rs=RS} END{print ""}' file
company name, address line, city, state, zip, extra info
company name, address line, city, state, zip, extra info
company name, address line, city, state, zip, extra info
paste is a good tool for this job (assuming you are ok with , as a delimiter instead of , followed by space)
<file paste -d',' - - - -
company name,address line,city, state, zip,extra info
company name,address line,city, state, zip,extra info
company name,address line,city, state, zip,extra info
Alternately
<file paste -s -d',,,\n'
You could also try this awk command:
awk 'BEGIN{RS="company"}{ gsub (/\n/,", ");} NR>=2 {sub (/, $/,""); print RS$0}' file
Example:
$ cat file
company name
address line
city, state, zip
extra info
company name
address line
city, state, zip
extra info
company name
address line
city, state, zip
extra info
$ awk 'BEGIN{RS="company"}{ gsub (/\n/,", ");} NR>=2 {sub (/, $/,""); print RS$0}' file
company name, address line, city, state, zip, extra info
company name, address line, city, state, zip, extra info
company name, address line, city, state, zip, extra info

Separate text and pass it to a SQL

I'm using the latest Debian version.
I have this file:
2301,XT_ARTICLES
2101,XT_HOUSE_PHOTOS
301,XT_PDF
101611,XT_FIJOS
I want to separate this text so I can add the ID and the name to one SQL statement. The SQL must be repeated according to the number of lines in the file, but I don't know how I can do it.
Can anybody help me, please?
Does this fit your needs?
awk -F',' '{print "INSERT INTO foobar VALUES("$1", \047"$2"\047);"}' file.txt
INSERT INTO foobar VALUES(2301, 'XT_ARTICLES');
INSERT INTO foobar VALUES(2101, 'XT_HOUSE_PHOTOS');
INSERT INTO foobar VALUES(301, 'XT_PDF');
INSERT INTO foobar VALUES(101611, 'XT_FIJOS');
If it's OK, just pipe that into MySQL:
awk -F',' '
BEGIN{
print "USE qux;"
}
{
print "INSERT INTO foobar VALUES("$1,",\047"$2"\047);"
}' file.txt | mysql
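If you would rather inspect the generated statements before loading them, you can write them to a file first and feed that file to MySQL afterwards (inserts.sql and the qux database name are just placeholders here):
awk -F',' '{print "INSERT INTO foobar VALUES("$1", \047"$2"\047);"}' file.txt > inserts.sql
mysql qux < inserts.sql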