Separate text and pass it to an SQL statement

I'm using the latest Debian version.
I have this file:
2301,XT_ARTICLES
2101,XT_HOUSE_PHOTOS
301,XT_PDF
101611,XT_FIJOS
I want to split this text so I can insert the ID and the name into a single SQL statement. The statement must be repeated once per line of the file, but I don't know how to do it.
Can anybody help me, please?

Does this fit your needs?
awk -F',' '{print "INSERT INTO foobar VALUES(" $1 ", \047" $2 "\047);"}' file.txt
INSERT INTO foobar VALUES(2301, 'XT_ARTICLES');
INSERT INTO foobar VALUES(2101, 'XT_HOUSE_PHOTOS');
INSERT INTO foobar VALUES(301, 'XT_PDF');
INSERT INTO foobar VALUES(101611, 'XT_FIJOS');
If that's OK, just pipe it into MySQL:
awk -F',' '
BEGIN{
print "USE qux;"
}
{
print "INSERT INTO foobar VALUES(" $1 ", \047" $2 "\047);"
}' file.txt | mysql
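One caveat with the above: if a name can itself contain a single quote, the generated SQL breaks. A small variation (just a sketch; it passes the quote character in as an awk variable) doubles any embedded quotes first:

```shell
# Same approach, but SQL-escapes the name by doubling any
# embedded single quotes before wrapping it.
awk -F',' -v q="'" '{
    name = $2
    gsub(q, q q, name)    # double each embedded quote
    printf "INSERT INTO foobar VALUES(%s, %s%s%s);\n", $1, q, name, q
}' file.txt
```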


Struggling with an awk script, need help to get this done - just need your suggestion or logic

I have an SQL file to filter the data:
-- Edit this file by adding your SQL below each question.
-------------------------------------------------------------------------------
-------------------------------------------------------------
-- The following queries are based on the 1994 census data.
-------------------------------------------------------------
.read 1994-census-summary-1.sql
-- 4. what is the average age of people from China?
select avg(age)
from census
where native_country ='China';
-- 5. what is the average age of people from Taiwan?
select avg(age)
from census
where native_country ='Taiwan';
-- 6. which native countries have "land" in their name?
select distinct(native_country)
from census
where native_country like '%land%';
--------------------------------------------------------------------------------------
-- The following queries are based on the courses-ddl.sql and courses-small.sql data
--------------------------------------------------------------------------------------
drop table census;
.read courses-ddl.sql
.read courses-small-1.sql
-- 11. what are the names of all students who have taken some course? Don't show duplicates.
select distinct(name)
from student
where tot_cred > 0;
-- 12. what are the names of departments that offer 4-credit courses? Don't list duplicates.
select distinct(dept_name)
from course
where credits=4;
-- 13. What are the names and IDs of all students who have received an A in a computer science class?
select distinct(name), id
from student natural join takes natural join course
where dept_name="Comp. Sci." and grade="A";
If I run
./script.awk -v ID=6 file.sql
Note that the problem id is passed to the awk script as variable ID on the command line, like this:
-v ID=6
How can I get the following result?
Result:
select distinct(native_country) from census where native_country like '%land%';
With your shown samples, in GNU awk, please try the following GNU awk code using its match function, where id is an awk variable holding the value you want to check for in the lines of your Input_file. I have also used exit to print only the very first match and leave the program early to save some time/cycles; in case you have more than one match, simply remove it from the following code.
awk -v RS= -v id="6" '
match($0,/(\n|^)-- ([0-9]+)\.[^\n]*\n(select[^;]*;)/,arr) && arr[2]==id{
gsub(/\n/,"",arr[3])
print arr[3]
exit
}
' Input_file
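The three-argument match() above is a gawk extension. If it is not available, a POSIX awk approximation (a sketch, assuming each query ends with a ; on its last line) is to start collecting at the numbered comment and stop at the terminating semicolon:

```shell
awk -v id=6 '
    $0 ~ "^-- " id "\\." { found = 1; next }  # start at the numbered comment
    found {
        query = query $0 " "
        if (/;[[:space:]]*$/) {               # the query ends at the semicolon
            sub(/ $/, "", query)
            print query
            exit
        }
    }' file.sql
```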
One option with awk could be matching the start of the line against -- 6. where 6 is the ID.
Then move to the next line and set a variable recording that the start of the part you want has been seen.
Then collect all lines while that variable is set, resetting it to 0 when an "empty" line is encountered.
Concatenate the collected lines into a single output line and, at the end, remove the trailing space.
gawk -v ID=6 '
match($0, "^-- "ID"\\.") {
seen=1
next
}
/^[[:space:]]*$/ {
seen=0
}
seen {
a = a $0 " "
}
END {
sub(/ $/, "", a)
print a
}
' file.sql
Or as a single line
gawk -v ID=6 'match($0,"^-- "ID"\\."){seen=1;next};/^[[:space:]]*$/{seen=0};seen{a=a$0" "};END{sub(/ $/,"",a);print a}' file.sql
Output
select distinct(native_country) from census where native_country like '%land%';
Another option with GNU awk is setting the record separator to an "empty" line and using a regex with a capture group to match all lines after the initial -- ID line that do not start with a space:
gawk -v ID=6 '
match($0, "\\n-- "ID"\\.[^\\n]*\\n(([^[:space:]][^\\n]*(\\n|$))*)", m) {
gsub(/\n/, " ", m[1])
print m[1]
}
' RS='^[[:space:]]*$' file

generate SQL queries out of CSV table

Good morning,
I have a CSV file whose first line contains the column names of my table; the rest is data. Something like this:
FIELD1,FIELD2,FIELD3
data1,data2,data3
data1,data2,data3
Now I have been trying to write a script that returns the following output and can be reused:
INSERT INTO tablename (FIELD1,FIELD2,FIELD3) VALUES
(data1,data2,data3)
INSERT INTO tablename (FIELD1,FIELD2,FIELD3) VALUES
(data1,data2,data3)
INSERT INTO tablename (FIELD1,FIELD2,FIELD3) VALUES
(data1,data2,data3)
This is what I have so far, but it does not return the correct output.
firstline=$(printf '%s\n' 1p d wq | ed -s file.csv )
cat file.csv | while read line
do
field1=$(echo "$line" | cut -d "," -f1)
field2=$(echo "$line" | cut -d "," -f2)
field3=$(echo "$line" | cut -d "," -f3)
echo "INSERT INTO tablename ($firstline) VALUES ($fields1 $field2 $field3) ">prova.csv
done
) VALUES ( 15blename (data1,1,1
I am not sure I can use the variable $firstline inside the while loop... but I don't understand why it doesn't print the INSERT INTO part and the correct parentheses.
Thanks in advance.
EDIT:
I have a new problem: SQL Assistant does not allow me to insert values that are not enclosed in single quotes, so my question is how do I edit the script to make the output look like this:
INSERT INTO tablename (columns) VALUES ('data1','data2','data3')
thanks
Using awk:
awk 'NR==1{x=$0;next} {printf "INSERT INTO tablename (%s) VALUES (%s)\n",x,$0}' file
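For the quoting requirement in the EDIT, one variation (a sketch; it assumes the data fields themselves contain no commas or single quotes) wraps every value in quotes before printing:

```shell
awk 'NR == 1 { header = $0; next }        # remember the header line
     {
         gsub(/[^,]+/, "\047&\047")       # wrap each field in single quotes
         printf "INSERT INTO tablename (%s) VALUES (%s)\n", header, $0
     }' file
```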

SQL: how to fix these errors?

So I have to loop through a folder of .dat files, extract the data and use INSERT INTO to insert the data into a database.
Here is a pastebin of one of the files to see the data I am working with:
http://pastebin.com/dn4wQjjE
To run the script I just call:
populate_database.sh directoryWithDatFiles
And the contents of the populate_database.sh script:
rm test.sql;
sqlite3 test.sql "CREATE TABLE HotelReviews (HotelID SMALLINT, ReviewID SMALLINT, Author CHAR, Content CHAR, Date CHAR, Readers SMALLINT, HelpfulReviews SMALLINT, Over$
IFS=$'\n'
for file in $1/*;
do
author=($(grep "<Author>" $file | sed 's/<Author>//g'));
content=($(grep "<Content>" $file | sed 's/<Content>//g'));
date=($(grep "<Date>" $file | sed 's/<Date>//g'));
readers=($(grep "<No. Reader>" $file | sed 's/<No. Reader>//g'));
helpful=($(grep "<No. Helpful>" $file | sed 's/<No. Helpful>//g'));
overall=($(grep "<Overall>" $file | sed 's/<Overall>//g'));
value=($(grep "<Values>" $file | sed 's/<Value>//g'));
rooms=($(grep "<Room>" $file | sed 's/<Room>//g'));
location=($(grep "<Location>" $file | sed 's/<Location>//g'));
cleanliness=($(grep "<Cleanliness>" $file | sed 's/<Cleanliness>//g'));
receptionarea=($(grep "<Check in / front desk>" $file | sed 's/<Check in \/ front desk>//g'));
service=($(grep "<Service>" $file | sed 's/<Service>//g'));
businessservice=($(grep "<Business service>" $file | sed 's/<Business service>//g'));
length=${#author[@]}
hotelID="$(echo $file | sed 's/.dat//g' | sed 's/[^0-9]*//g')";
for((i = 0; i < length; i++)); do
sqlite3 test.sql "INSERT INTO HotelReviews VALUES($hotelID, $i, 'author', 'content', 'date', ${readers[i]}, ${helpful[i]}, ${overall[i]}, 9, 10, ${location[i]}, ${cleanliness[i]}, ${receptionarea[i]}, ${service[i]}, ${businessservice[i]})";
done
done
sqlite3 test.sql "SELECT * FROM HotelReviews;"
The problem I have though, is that although much of the script is working, there are still 5 of the 15 columns that I can't get working. I'll just screenshot the errors I get when trying to change the code from:
'author' --> ${author[i]}: http://i.imgur.com/zKQLSqT.jpg
'content' --> ${content[i]}: http://i.imgur.com/pnirIo3.jpg
'date' --> ${date[i]}: http://i.imgur.com/urF5DTa.jpg
9 --> ${value[i]}: http://i.imgur.com/AnBFSWp.jpg
10 --> ${rooms[i]}: same errors as above
Anyway, if anyone could help me out on this, I'd be massively grateful.
Cheers!
If you deal with a lot of XML, I recommend getting to know a SAX parser, such as the one in the Python standard library. Anyone willing to write a shell script like that has the chops to learn it, and the result will be easier to read and at least have a prayer of being correct.
If you want to stick with regex hacking, turn to awk. Using ">" as your field separator, your script could be simplified with awk lines like
/<Author>/ { gsub(/'/, "''", $2); author=$2 }
/<Content>/ { gsub(/'/, "''", $2); content=$2 }
...
END { print author, content, ... }
The gsub takes care of your SQL quoting problem by doubling any single quotes in the data.
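As a tiny illustration of that idea (hypothetical sample data; the real files carry many more tags), splitting on ">" and doubling embedded quotes before emitting a value:

```shell
printf '%s\n' "<Author>D. O'Brien" '<Overall>4' |
awk -F'>' -v q="'" '
    /<Author>/  { gsub(q, q q, $2); author = $2 }   # double embedded quotes
    /<Overall>/ { overall = $2 }
    END { printf "INSERT INTO HotelReviews VALUES(%s%s%s, %s);\n", q, author, q, overall }'
```

This prints INSERT INTO HotelReviews VALUES('D. O''Brien', 4); which sqlite3 will accept even though the author's name contains a quote.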

Awk /sed extract information when a pattern match from a paragraph

I want to search for the pattern "FROM" in a paragraph that begins with CREATE VIEW and ends with ";", and save the result in a CSV file. For example, if I have the following file:
CREATE VIEW view1
AS something
FROM table1 ,table2 as A, table3 (something FROM table4)
FROM table5, table6
USING file1
;
CREATE VIEW view2
FROM table1 ,table2 ,table6 ,table4
something
something
FROM table5 ,table7 (something FROM table4 ,table5(this is something FROM table8)
USING file2
;
I would like to have the following result:
view1;table1
view1;table2
view1;table3
view1;table4
view1;table5
view1;table6
view2;table1
view2;table2
view2;table6
view2;table4
view2;table5
view2;table7
view2;table4
view2;table5
view2;table8
I won't pretend to know the syntax of whatever follows FROM in your input file, so here's how to identify the view and split the FROM lines at commas; you can take it from there:
$ cat tst.awk
BEGIN { FS="[[:space:]]*,[[:space:]]*"; OFS=";" }
sub(/^CREATE VIEW[[:space:]]+/,"") { view = $0 }
sub(/^FROM[[:space:]]+/,"") {
for (i=1;i<=NF;i++) {
print view, $i
}
}
$ awk -f tst.awk file
view1;table1
view1;table2 as A
view1;table3 (something FROM table4)
view1;table5
view1;table6
view2;table1
view2;table2
view2;table6
view2;table4
view2;table5
view2;table7 (something FROM table4
view2;table5(this is something FROM table8)
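One possible follow-up pass over that output (a sketch; results.txt is a hypothetical file holding the lines above, and this only strips a trailing `as NAME` alias plus anything from an opening parenthesis onward, so tables named inside nested FROMs are still lost):

```shell
awk -F';' '{
    table = $2
    sub(/ *\(.*$/, "", table)                 # drop any parenthesized tail
    sub(/ +as +[[:alnum:]_]+$/, "", table)    # drop a trailing alias
    if (table != "") print $1 ";" table
}' results.txt
```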

regex to split name=value,* into csv of name,* and value,*

I would like to split a line such as:
name1=value1,name2=value2, .....,namen=valuen
to produce two lines as follows:
name1,name2, .....,namen
value1,value2, .....,valuen
the goal being to construct an SQL INSERT along the lines of:
input="name1=value1,name2=value2, .....,namen=valuen"
namescsv=$( echo $input | sed 's/=[^,]*//g' )
valuescsv=$( echo $input | ?????? )
INSERT INTO table_name ( $namescsv ) VALUES ( $valuescsv )
I'd like to do this as simply as possible; Perl, awk, or piping through multiple tr and cut calls seems too complicated. Given that the names part seems simple enough, I figure there must be something similar for the values, but I can't work it out.
You can just invert your character match:
echo $input | sed 's/[^,]*=//g'
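Combining that with the names expression from the question gives the full statement the asker sketched (assuming, as the question implies, that names and values contain no commas, equals signs, or quotes):

```shell
input="name1=value1,name2=value2,namen=valuen"
namescsv=$(echo "$input" | sed 's/=[^,]*//g')   # keep only the names
valuescsv=$(echo "$input" | sed 's/[^,]*=//g')  # keep only the values
echo "INSERT INTO table_name ( $namescsv ) VALUES ( $valuescsv )"
```

This prints INSERT INTO table_name ( name1,name2,namen ) VALUES ( value1,value2,valuen ) - note there is still no SQL quoting or escaping here.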
I think your best bet is still sed -re 's/[^=,]*=([^,]*)/\1/g', though I guess the input would have to match your table exactly.
Note that in some RDBMS you can use the following syntax:
INSERT INTO table_name SET name=value, name2=value2, ...;
http://dev.mysql.com/doc/refman/5.5/en/insert.html
The following shell script does what you are asking for and takes care of escaping (not only because of injection, but you may want to insert values with quotes in them):
_IFS="$IFS"; IFS=","
line="name1=value1,name2=value2,namen=valuen";
for pair in $line; do
names="$names,${pair%=*}"
values="$values,'$(escape_sql "${pair#*=}")'"
done
IFS="$_IFS"
echo "INSERT INTO table_name ( ${names#,} ) VALUES ( ${values#,} )"
Output:
INSERT INTO table_name ( name1,name2,namen ) VALUES ( 'value1','value2','valuen' )