i have this query
insert into changes (id_registro)
select d2.id_registro
from daily2 d2
where exists (
select 1
from daily d1
where
d1.id_registro = d2.id_registro
and (d2.origen, d2.sector, d2.entidad_um, d2.sexo, d2.entidad_nac, d2.entidad_res,
d2.municipio_res, d2.tipo_paciente,d2.fecha_ingreso, d2.fecha_sintomas,
d2.fecha_def, d2.intubado, d2.neumonia, d2.edad, d2.nacionalidad, d2.embarazo,
d2.habla_lengua_indig, d2.diabetes, d2.epoc, d2.asma, d2.inmusupr, d2.hipertension,
d2.otra_com, d2.cardiovascular, d2.obesidad,
d2.renal_cronica, d2.tabaquismo, d2.otro_caso, d2.resultado, d2.migrante,
d2.pais_nacionalidad, d2.pais_origen, d2.uci )
<>
(d1.origen, d1.sector, d1.entidad_um, d1.sexo, d1.entidad_nac, d1.entidad_res,
d1.municipio_res, d1.tipo_paciente, d1.fecha_ingreso, d1.fecha_sintomas,
d1.fecha_def, d1.intubado, d1.neumonia, d1.edad, d1.nacionalidad, d1.embarazo,
d1.habla_lengua_indig, d1.diabetes, d1.epoc, d1.asma, d1.inmusupr, d1.hipertension,
d1.otra_com, d1.cardiovascular, d1.obesidad,
d1.renal_cronica, d1.tabaquismo, d1.otro_caso, d1.resultado, d1.migrante,
d1.pais_nacionalidad, d1.pais_origen, d1.uci ))
it results in an insersion data that doesn't exist in another table, that's fine. but i want know exactly which field has changed to store it in a log table
You don't mention precisely what you expect to see in your output but basically to accomplish what you're after you'll need a long sequence of CASE clauses, one for each column
e.g. one approach might be to create a comma-separated list of the column names that have changed:
INSERT INTO changes (id_registro, column_diffs)
SELECT d2.id_registro,
CONCAT(
CASE WHEN d1.origen <> d2.origen THEN 'Origen,' ELSE '' END,
CASE WHEN d1.sector <> d2.sector THEN 'Sector,' ELSE '' END,
etc.
Within the THEN part of the CASE you can build whatever detail you want to show
e.g. a string showing before and after values of the columns CONCAT('Origen: Was==> ', d1.origen, ' Now==>', d2.origen). Presumably though you'll also need to record the times of these changes if there can be multiple updates to the same record throughout the day.
Essentially you'll need to decide what information you want to show in your logfile, but based on your example query you should have all the information you need.
As in title, there is an error in my first code in FOR loop: Command contains unrecognized phrase. I am thinking if the method string+variable is wrong.
ALTER TABLE table1 ADD COLUMN prod_n c(10)
ALTER TABLE table1 ADD COLUMN prm1 n(19,2)
ALTER TABLE table1 ADD COLUMN rbon1 n(19,2)
ALTER TABLE table1 ADD COLUMN total1 n(19,2)
There are prm2... until total5, in which the numbers represent the month.
FOR i=1 TO 5
REPLACE ALL prm+i WITH amount FOR LEFT(ALLTRIM(a),1)="P" AND
batch_mth = i
REPLACE ALL rbon+i WITH amount FOR LEFT(ALLTRIM(a),1)="R"
AND batch_mth = i
REPLACE ALL total+i WITH sum((prm+i)+(rbon+i)) FOR batch_mth = i
NEXT
ENDFOR
Thanks for the help.
There are a number of things wrong with the code you posted above. Cetin has mentioned a number of them, so I apologize if I duplicate some of them.
PROBLEM 1 - in your ALTER TABLE commands I do not see where you create fields prm2, prm3, prm4, prm5, rbon2, rbon3, etc.
And yet your FOR LOOP would be trying to write to those fields as the FOR LOOP expression i increases from 1 to 5 - if the other parts of your code was correct.
PROBLEM 2 - You cannot concatenate a String to an Integer so as to create a Field Name like you attempt to do with prm+i or rbon+1
Cetin's code suggestions would work (again as long as you had the #2, #3, etc. fields defined). However in Foxpro and Visual Foxpro you can generally do a task in a variety of ways.
Personally, for readability I'd approach your FOR LOOP like this:
FOR i=1 TO 5
* --- Keep in mind that unless fields #2, #3, #4, & #5 are defined ---
* --- The following will Fail ---
cFld1 = "prm" + STR(i,1) && define the 1st field
cFld2 = "rbon" + STR(i,1) && define the 2nd field
cFld3 = "total" + STR(i,1) && define the 3rd field
REPLACE ALL &cFld1 WITH amount ;
FOR LEFT(ALLTRIM(a),1)="P" AND batch_mth = i
REPLACE ALL &cFld2 WITH amount ;
FOR LEFT(ALLTRIM(a),1)="R" AND batch_mth = i
REPLACE ALL &cFld3 WITH sum((prm+i)+(rbon+i)) ;
FOR batch_mth = i
NEXT
NOTE - it might be good if you would learn to use VFP's Debug tools so that you can examine your code execution line-by-line in the VFP Development mode. And you can also use it to examine the variable values.
Breakpoints are good, but you have to already have the TRACE WINDOW open for the Break to work.
SET STEP ON is the Debug command that I generally use so that program execution will stop and AUTOMATICALLY open the TRACE WINDOW for looking at code execution and/or variable values.
Do you mean you have fields named prm1, prm2, prm3 ... prm12 that represent the months and you want to update them in a loop? If so, you need to understand that a "fieldName" is a "name" and thus you need to use a "name expression" to use it as a variable. That is:
prm+i
would NOT work but:
( 'pro'+ ltrim(str(m.i)) )
would.
For example here is your code revised:
For i=1 To 5
Replace All ('prm'+Ltrim(Str(m.i))) With amount For Left(Alltrim(a),1)="P" And batch_mth = m.i
Replace All ('rbon'+Ltrim(Str(m.i))) With amount For Left(Alltrim(a),1)="R" And batch_mth = m.i
* ????????? REPLACE ALL ('total'+Ltrim(Str(m.i))) WITH sum((prm+i)+(rbon+i)) FOR batch_mth = i
Endfor
However, I must admit, your code doesn't make sense to me. Maybe it would be better if you explained what you are trying to do and give some simple data with the result you expect (as code - you can use FAQ 50 on foxite to create code for data).
I am using the sqlquery function in R to connect the DB with R. I am using the following lines
for (i in 1:length(Counter)){
if (Counter[i] %in% str_sub(dir(),1,29) == FALSE){
DT <- data.table(sqlQuery(con, paste0("select a.* from edp_data.sme_loan a
where a.edcode IN (", print(paste0("\'",EDCode,"\'"), quote=FALSE),
") and a.poolcutoffdate in (",print(paste0("\'",str_sub(PoolCutoffDate,1,4),"-",str_sub(PoolCutoffDate,5,6),"-",
str_sub(PoolCutoffDate,7,8),"\'"), quote=FALSE),")")))}}
Thus I am importing subsets of the DB by EDCode and PoolCutoffDate. This works perfectly, however there is one variable in edp_data.sme in one particular EDCode which produces an undesired result.
If I take the unique of this "as.3" variable for a particular EDCode I get:
unique(DT$as3)
[1] 30003000000000019876240886000 30003000000000028672000424000
In reality there shoud be more unique IDs in this DB. The problem is that the string of as3 is much longer than the one which is imported.
nchar(unique(DT$as3))
[1] 29 29
How can I import more characters from this string? I do not want to specify each variable in select a.* ideally, but only make sure that it imports the full string of as3.
Any help is appreciated!
I'm looking for a more efficient way to run many columns updates on the same table like this:
UPDATE TABLE table
SET col = regexp_replace( col, 'foo', 'bar' )
WHERE regexp_match( col, 'foo' );
Such that foo, and bar, will be a combination of 40 different regex-replaces. I doubt even 25% of the dataset needs to be updated at all, but what I'm wanting to know is it is possible to cleanly achieve the following in SQL.
A single pass update
A single match of the regex, triggers a single replace
Not running all possible regexp_replaces if only one matches
Not updating all columns if only one needs the update
Not updating a row if no column has changed
I'm also curious, I know in MySQL (bear with me)
UPDATE foo SET bar = 'baz'
Has an implicit WHERE bar != 'baz' clause
However, in PostgreSQL I know this doesn't exist: I think I could at least answer one of my questions if I knew how to skip a single row's update if the target columns weren't updated.
Something like
UPDATE TABLE table
SET col = *temp_var* = regexp_replace( col, 'foo', 'bar' )
WHERE col != *temp_var*
Do it in code. Open up a cursor, then: grab a row, run it through the 40 regular expressions, and if it changed, save it back. Repeat until the cursor doesn't give you any more rows.
Whether you do it that way or come up with the magical SQL expression, it's still going to be a row scan of the entire table, but the code will be much simpler.
Experimental Results
In response to criticism, I ran an experiment. I inserted 10,000 lines from a documentation file into a table with a serial primary key and a varchar column. Then I tested two ways to do the update. Method 1:
in a transaction:
opened up a cursor (select for update)
while reading 100 rows from the cursor returns any rows:
for each row:
for each regular expression:
do the gsub on the text column
update the row
This takes 1.16 seconds with a locally connected database.
Then the "big replace," a single mega-regex update:
update foo set t =
regexp_replace(regexp_replace(regexp_replace(regexp_replace(regexp_replace(regexp_replace(regexp_replace(regexp_replace(regexp_replace(regexp_replace(regexp_replace(regexp_replace(regexp_replace(regexp_replace(regexp_replace(regexp_replace(regexp_replace(regexp_replace(regexp_replace(regexp_replace(regexp_replace(regexp_replace(regexp_replace(regexp_replace(regexp_replace(regexp_replace(regexp_replace(regexp_replace(regexp_replace(regexp_replace(regexp_replace(regexp_replace(regexp_replace(regexp_replace(regexp_replace(regexp_replace(regexp_replace(regexp_replace(regexp_replace(regexp_replace(regexp_replace(t,
E'\bcommit\b', E'COMMIT'),
E'\b9acf10762b5f3d3b1b33ea07792a936a25e45010\b',
E'9ACF10762B5F3D3B1B33EA07792A936A25E45010'),
E'\bAuthor:\b', E'AUTHOR:'),
E'\bCarl\b', E'CARL'), E'\bWorth\b',
E'WORTH'), E'\b\b',
E''), E'\bDate:\b',
E'DATE:'), E'\bMon\b', E'MON'),
E'\bOct\b', E'OCT'), E'\b26\b',
E'26'), E'\b04:53:13\b', E'04:53:13'),
E'\b2009\b', E'2009'), E'\b-0700\b',
E'-0700'), E'\bUpdate\b', E'UPDATE'),
E'\bversion\b', E'VERSION'),
E'\bto\b', E'TO'), E'\b2.9.1\b',
E'2.9.1'), E'\bcommit\b', E'COMMIT'),
E'\b61c89e56f361fa860f18985137d6bf53f48c16ac\b',
E'61C89E56F361FA860F18985137D6BF53F48C16AC'),
E'\bAuthor:\b', E'AUTHOR:'),
E'\bCarl\b', E'CARL'), E'\bWorth\b',
E'WORTH'), E'\b\b',
E''), E'\bDate:\b',
E'DATE:'), E'\bMon\b', E'MON'),
E'\bOct\b', E'OCT'), E'\b26\b',
E'26'), E'\b04:51:58\b', E'04:51:58'),
E'\b2009\b', E'2009'), E'\b-0700\b',
E'-0700'), E'\bNEWS:\b', E'NEWS:'),
E'\bAdd\b', E'ADD'), E'\bnotes\b',
E'NOTES'), E'\bfor\b', E'FOR'),
E'\bthe\b', E'THE'), E'\b2.9.1\b',
E'2.9.1'), E'\brelease.\b',
E'RELEASE.'), E'\bThanks\b',
E'THANKS'), E'\bto\b', E'TO'),
E'\beveryone\b', E'EVERYONE'),
E'\bfor\b', E'FOR')
The mega-regex update takes 0.94 seconds to update.
At 0.94 seconds compared to 1.16, it's true that the mega-regex update is faster, running in 81% of the time of doing it in code. It is not, however a lot faster. And ye Gods, look at that update statement. Do you want to write that, or try to figure out what went wrong when Postgres complains that you dropped a parenthesis somewhere?
Code
The code used was:
def stupid_regex_replace
sql = Select.new
sql.select('id')
sql.select('t')
sql.for_update
sql.from(TABLE_NAME)
Cursor.new('foo', sql, {}, #db) do |cursor|
until (rows = cursor.fetch(100)).empty?
for row in rows
for regex, replacement in regexes
row['t'] = row['t'].gsub(regex, replacement)
end
end
sql = Update.new(TABLE_NAME, #db)
sql.set('t', row['t'])
sql.where(['id = %s', row['id']])
sql.exec
end
end
end
I generated the regular expressions dynamically by taking words from the file; for each word "foo", its regular expression was "\bfoo\b" and its replacement string was "FOO" (the word uppercased). I used words from the file to make sure that replacements did happen. I made the test program spit out the regex's so you can see them. Each pair is a regex and the corresponding replacement string:
[[/\bcommit\b/, "COMMIT"],
[/\b9acf10762b5f3d3b1b33ea07792a936a25e45010\b/,
"9ACF10762B5F3D3B1B33EA07792A936A25E45010"],
[/\bAuthor:\b/, "AUTHOR:"],
[/\bCarl\b/, "CARL"],
[/\bWorth\b/, "WORTH"],
[/\b<cworth#cworth.org>\b/, "<CWORTH#CWORTH.ORG>"],
[/\bDate:\b/, "DATE:"],
[/\bMon\b/, "MON"],
[/\bOct\b/, "OCT"],
[/\b26\b/, "26"],
[/\b04:53:13\b/, "04:53:13"],
[/\b2009\b/, "2009"],
[/\b-0700\b/, "-0700"],
[/\bUpdate\b/, "UPDATE"],
[/\bversion\b/, "VERSION"],
[/\bto\b/, "TO"],
[/\b2.9.1\b/, "2.9.1"],
[/\bcommit\b/, "COMMIT"],
[/\b61c89e56f361fa860f18985137d6bf53f48c16ac\b/,
"61C89E56F361FA860F18985137D6BF53F48C16AC"],
[/\bAuthor:\b/, "AUTHOR:"],
[/\bCarl\b/, "CARL"],
[/\bWorth\b/, "WORTH"],
[/\b<cworth#cworth.org>\b/, "<CWORTH#CWORTH.ORG>"],
[/\bDate:\b/, "DATE:"],
[/\bMon\b/, "MON"],
[/\bOct\b/, "OCT"],
[/\b26\b/, "26"],
[/\b04:51:58\b/, "04:51:58"],
[/\b2009\b/, "2009"],
[/\b-0700\b/, "-0700"],
[/\bNEWS:\b/, "NEWS:"],
[/\bAdd\b/, "ADD"],
[/\bnotes\b/, "NOTES"],
[/\bfor\b/, "FOR"],
[/\bthe\b/, "THE"],
[/\b2.9.1\b/, "2.9.1"],
[/\brelease.\b/, "RELEASE."],
[/\bThanks\b/, "THANKS"],
[/\bto\b/, "TO"],
[/\beveryone\b/, "EVERYONE"],
[/\bfor\b/, "FOR"]]
If this were a hand-generated list of regex's, and not automatically generated, my question is still appropriate: Which would you rather have to create or maintain?
For the skip update, look at suppress_redundant_updates - see http://www.postgresql.org/docs/8.4/static/functions-trigger.html.
This is not necessarily a win - but it might well be in your case.
Or perhaps you can just add that implicit check as an explicit one?