mismatched input 'from'. Expecting: ',', <expression> - sql

I have a query that I am running on AWS Athena that should return all the filenames that are not contained in the second table. Basically, I am trying to find all the filenames that are not in the ejpos landing table.
The one table looks like this (item sales):
origin_file | run_id
/datarite/ejpos/8023/20220706/filename1 | 8035
/datarite/ejpos/8023/20220706/filename2 | 8035
/datarite/ejpos/8023/20220706/filename3 | 8035
The other table looks like this (ejpos_files_landing):
filename
filename1
filename2
filename3
filename4
They don't have the same number of rows, hence I am trying to find the file names that are in ejpos_pos_landing but not in item sales table.
I get this error when I run:
mismatched input 'from'. Expecting: ',', <expression>
The query is here:
SELECT trim("/datarite/ejpos/8023/20220706/" from "validated"."datarite_ejpos_itemsale" where
run_id = '8035') as origin_file,
FROM "validated"."datarite_ejpos_itemsale"
LEFT JOIN "landing"."ejpos_landing_files" ON "landing"."ejpos_landing_files".filename =
"validated"."datarite_ejpos_itemsale".origin_file
WHERE "landing"."ejpos_landing_files".filename IS NULL;
The expected result would be:
|filename4|
Because it is not in the other table
Can anyone assist?

There are several problems in your query, given the example data and the stated goal:
trim("/datarite/ejpos/8023/20220706/" from "validated"."datarite_ejpos_itemsale" where run_id = '8035') as origin_file is not valid SQL.
ON "landing"."ejpos_landing_files".filename = "validated"."datarite_ejpos_itemsale".origin_file will not work because origin_file carries a directory prefix. You can use strpos if the filename can occur only once in origin_file.
Your join and filter conditions are built to find items present in datarite_ejpos_itemsale and missing from ejpos_landing_files, while you state that the reverse is needed.
There is also the extra comma before FROM that was mentioned in the comments.
Try this:
-- sample data
WITH item_sales(origin_file, run_id) AS (
VALUES ('/datarite/ejpos/8023/20220706/filename1', 8035),
('/datarite/ejpos/8023/20220706/filename2', 8035),
('/datarite/ejpos/8023/20220706/filename3', 8035),
('/datarite/ejpos/8023/20220706/filename4', 8036)
),
ejpos_files_landing(filename) as(
VALUES ('filename1'),
('filename2'),
('filename3'),
('filename4')
)
-- query
select filename
from ejpos_files_landing l
left outer join item_sales s -- reverse the join
on strpos(s.origin_file, l.filename) >= 1 -- assuming that filename should be present only one time in the string
and s.run_id = 8035 -- if you need to filter out run id
where s.origin_file is null
Output:
filename
filename4
Alternative approach you can try:
-- query
select filename
from ejpos_files_landing l
where filename not in (
select element_at(split(origin_file, '/'), -1) -- split by '/' and get last
from item_sales
where run_id = 8035
)
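To check the join direction locally, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for Athena. This is an assumption made for illustration only: SQLite has no strpos, so instr (same argument order: string, then substring) takes its place, and the tables are created from the sample data above.

```python
import sqlite3

# In-memory SQLite database standing in for Athena (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE item_sales (origin_file TEXT, run_id INTEGER);
    CREATE TABLE ejpos_files_landing (filename TEXT);
    INSERT INTO item_sales VALUES
        ('/datarite/ejpos/8023/20220706/filename1', 8035),
        ('/datarite/ejpos/8023/20220706/filename2', 8035),
        ('/datarite/ejpos/8023/20220706/filename3', 8035),
        ('/datarite/ejpos/8023/20220706/filename4', 8036);
    INSERT INTO ejpos_files_landing VALUES
        ('filename1'), ('filename2'), ('filename3'), ('filename4');
""")

# Anti-join: keep landing files that have no matching item_sales row for run 8035.
missing = conn.execute("""
    SELECT l.filename
    FROM ejpos_files_landing l
    LEFT JOIN item_sales s
           ON instr(s.origin_file, l.filename) >= 1  -- instr replaces strpos here
          AND s.run_id = 8035
    WHERE s.origin_file IS NULL
""").fetchall()
print(missing)
```

Keeping the run_id filter inside the ON clause matters: moving it to WHERE would discard the unmatched rows that the IS NULL test is looking for.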

Related

How to exclude SQL variable in output

I have a complex SQL query where I have a few cases that use END AS variableName. I then use variableName to do some logic and then create a new variable which I want in the output result. However when I run the query, all the END AS variableNames that I have used are also outputted in the results.
Is there a way that I can exclude these variables as I only want the final variable that uses these variableNames.
Thanks
EDIT, here is a query explaining my problem
SELECT DISTINCT
mt.src_id AS "SRC_ID",
CASE
WHEN mt.cd = 'TAN' THEN
(
(
SELECT SUM(src_amt)
FROM source_table st
WHERE mt.id = st.id
AND st._cd = 'TAN'
AND st.amt_cd = 'ABL')
)
END AS src_amt
FROM MAIN_TABLE mt
WHERE
mf.dt >= 2021-12-12
AND SRC_AMT > 10
I need SRC_AMT to be used in some sort of logic, but when I run the query, it is printed in the output as its own column. I want to leave this variable out of the output.
You can wrap the whole thing in a new SELECT statement and give the inner query an alias, which many databases require:
select SRC_ID from ( <entire previous query here> ) t
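The wrapping idea can be tried out with a small sketch. It uses Python's sqlite3 module and a made-up main_table with a hypothetical amount column, not the tables from the question: the inner query computes the helper column src_amt, and the outer query filters on it but selects only src_id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table and data, for illustration only.
conn.executescript("""
    CREATE TABLE main_table (src_id INTEGER, amount INTEGER);
    INSERT INTO main_table VALUES (1, 5), (2, 20), (3, 50);
""")

cur = conn.execute("""
    SELECT src_id
    FROM (
        SELECT src_id,
               CASE WHEN amount > 10 THEN amount * 2 END AS src_amt
        FROM main_table
    ) t
    WHERE src_amt > 10   -- the helper column is usable here ...
    ORDER BY src_id
""")
columns = [d[0] for d in cur.description]  # ... but is not part of the output
rows = cur.fetchall()
print(columns, rows)
```

In most databases a column alias cannot be referenced in the WHERE clause of the query that defines it, so a condition such as SRC_AMT > 10 would likely need to move to the outer query as well.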

Spatial Query Postgis

I have a city polygon layer and another polygon layer that I imported into PostgreSQL/PostGIS. The polygons intersect with the cities. The first thing I need to do is write the id from the city table into the other table, so that each polygon gets the id of the city in which it is located. I tried a few functions to do this but got an error. Can you help me write the SQL command?
update maden_polygon set objectid = maden_polygon.ilce_id
from (SELECT maden_polygon.ilce_id as id ,ankara_ilce.objectid as ilce_id
FROM maden_polygon , ankara_ilce
WHERE st_intersects(maden_polygon.geom, ankara_ilce.geom)) as maden_polygon
where maden_polygon.ilce_id = anakara_ilce.object_id
(ERROR: table name "maden_polygon" specified more than once )
What I want to do is write the objectid column of the ankara_ilce table into the ilce_id column of the maden_polygon table.
That is, each mine should get the objectid of the county whose boundaries it lies within.
SELECT
maden_polygon.ilce_id as id ,
ankara_ilce.objectid as ad ,
ankara_ilce.adi as adi
from maden_polygon , ankara_ilce
where St_intersects(ankara_ilce.geom , maden_polygon.geom ) as sorgu
where maden_polygon.id = sorgu.id ;
ERROR: syntax error at or near "as"
LINE 6: ...ntersects(ankara_ilce.geom , maden_polygon.geom ) as sorgu
I think the query is as simple as this (assuming, as the question states, that ilce_id in maden_polygon should receive objectid from ankara_ilce):
UPDATE maden_polygon SET ilce_id = ankara_ilce.objectid
FROM ankara_ilce
WHERE st_intersects(maden_polygon.geom, ankara_ilce.geom)
Note, however, that st_intersects can match multiple ankara_ilce records per maden_polygon record if your polygons overlap, and that might give you inconsistent results. You could try st_contains instead (being aware that some records might not be updated that way), or you could match on the centroid of the one polygon, e.g.
UPDATE maden_polygon SET ilce_id = ankara_ilce.objectid
FROM ankara_ilce
WHERE st_within(st_centroid(maden_polygon.geom), ankara_ilce.geom)
Good luck!

How can I count all NULL values, without column names, using SQL?

I'm reading and executing sql queries from file and I need to inspect the result sets to count all the null values across all columns. Because the SQL is read from file, I don't know the column names and thus can't call the columns by name when trying to find the null values.
I think using CTE is the best way to do this, but how can I call the columns when I don't know what the column names are?
WITH query_results AS
(
<sql_read_from_file_here>
)
select count_if(<column_name> is not null) FROM query_results
If you are using Python to read the file of SQL statements, you can do something like this which uses pglast to parse the SQL query to get the columns for you:
import pglast
sql_read_from_file_here = "SELECT 1 foo, 1 bar"
ast = pglast.parse_sql(sql_read_from_file_here)
# dict-style access as written in this answer; newer pglast releases may return AST objects instead
cols = ast[0]['RawStmt']['stmt']['SelectStmt']['targetList']
# template for one per-column null counter
sum_sql = "sum(iff({col} is null,1,0))"
sums = [sum_sql.format(col=col['ResTarget']['name']) for col in cols]
print(f"select {' + '.join(sums)} total_null_count from query_results")
# outputs: select sum(iff(foo is null,1,0)) + sum(iff(bar is null,1,0)) total_null_count from query_results
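If the result set is fetched in Python anyway, another option is to skip SQL parsing and count the None values in the returned rows directly. The following is a minimal sketch under that assumption. sqlite3 is only a stand-in for the real database driver, and count_nulls is a hypothetical helper name.

```python
import sqlite3

def count_nulls(conn, sql):
    """Run sql and count NULL (None) values across all rows and columns."""
    cursor = conn.cursor()
    cursor.execute(sql)
    return sum(value is None for row in cursor.fetchall() for value in row)

conn = sqlite3.connect(":memory:")  # stand-in connection (assumption)
sql_read_from_file_here = "SELECT 1 AS foo, NULL AS bar UNION ALL SELECT NULL, NULL"
total_null_count = count_nulls(conn, sql_read_from_file_here)
print(total_null_count)
```

This pulls every row to the client, so it suits modest result sets. For large ones, generating the counting query as shown above keeps the work in the database.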

Postgresql : How to update one field for all duplicate values based at the end of the string of a field except one row

http://sqlfiddle.com/#!9/b98ea/1 (Sample Table)
I have a table with the following fields:
transfer_id
src_path
DH_USER_ID
email
status_state
ip_address
The src_path field contains some duplicate file names, each with a different folder name at the beginning of the string.
Example:
191915/NequeVestibulumEget.mp3
/191918/NequeVestibulumEget.mp3
191920/NequeVestibulumEget.mp3
I am trying to do the following:
Set status_state field to 'canceled' for all the duplicate filenames within (src_path) field except for one.
I want the results to look like this:
http://sqlfiddle.com/#!9/5e65f/2
*I apologize in advance for being a complete noob, but I am taking SQL at college and I need help.
SQL Fiddle Demo
fix_os_name: Convert Windows path separators to Unix format.
file_name: Split the path on / and use char_length to count the separators, so that the last part can be picked.
drank: Number the rows for each file name. A unique file name only gets 1, but duplicates also get 2, 3, ...
UPDATE: Check whether the row has rn > 1, which means it is a duplicate.
Note that the syntax highlighting is wrong, but the code runs fine.
with fix_os_name as (
SELECT transfer_id, replace(src_path,'\','/') src_path,
DH_USER_ID, email, status_state, ip_address
FROM priority_transfer p
),
file_name as (
SELECT
fon.*,
split_part(src_path,
'/',
char_length(src_path) - char_length(replace(src_path,'/','')) + 1
) sfile
FROM fix_os_name fon
),
drank as (
SELECT
f.*,
row_number() over (partition by sfile order by sfile) rn
from file_name f
)
UPDATE priority_transfer p
SET status_state = 'canceled'
WHERE EXISTS ( SELECT *
FROM drank d
WHERE d.transfer_id = p.transfer_id
AND d.rn > 1);
ADD: One row per file name is left untouched.
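The same steps (normalize the separators, take the last path segment, keep the first row per file name and cancel the rest) can be sketched in plain Python to check the logic on a few rows. The function name cancel_duplicates, the 'queued' status, and the choice of keeping the lowest transfer_id are assumptions made for this illustration.

```python
def cancel_duplicates(rows):
    """Mark all but the first row per file name as 'canceled' (in place)."""
    seen = set()
    # Assumption: the row with the lowest transfer_id is the one to keep.
    for row in sorted(rows, key=lambda r: r["transfer_id"]):
        # Normalize Windows separators, then take the last path segment.
        file_name = row["src_path"].replace("\\", "/").rsplit("/", 1)[-1]
        if file_name in seen:
            row["status_state"] = "canceled"
        else:
            seen.add(file_name)
    return rows

rows = [
    {"transfer_id": 1, "src_path": "191915/NequeVestibulumEget.mp3", "status_state": "queued"},
    {"transfer_id": 2, "src_path": "/191918/NequeVestibulumEget.mp3", "status_state": "queued"},
    {"transfer_id": 3, "src_path": "191920\\NequeVestibulumEget.mp3", "status_state": "queued"},
    {"transfer_id": 4, "src_path": "191921/Other.mp3", "status_state": "queued"},
]
cancel_duplicates(rows)
states = [r["status_state"] for r in rows]
print(states)
```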
Use the regexp_matches function to separate the file name from the directory.
From there you can use distinct() to build a table with unique values for the filename.
select
regexp_matches(src_path, '[a-zA-Z.0-9]*$') , *
from priority_transfer
;

How can I run a SQL query iteratively for every row in a table?

I have the following query:
DECLARE @AccString varchar(max)
SET @AccString=''
SELECT @AccString=@AccString + description + ' [ ] '
FROM tl_sb_accessoryInventory ai
JOIN tl_sb_accessory a on a.accessoryID = ai.accessoryID
WHERE userID=6
SELECT userID, serviceTag, model, @AccString AS ACCESSORIES FROM tl_sb_oldLaptop ol
JOIN tl_sb_laptopType lt ON ol.laptopTypeID = lt.laptopTypeID
WHERE userID=6
which outputs this:
What I want to be able to do is run this for every userID in a table tl_sb_user.
The statement to get the userIDs is:
Select userID from tl_sb_user
How can I get this to output a row as above for each user?
You are trying to do string concatenation in a subquery. In SQL Server, you need to do the string concatenation using a correlated subquery with for xml path. Arcane, but it generally works.
The result is something like this:
SELECT userID, serviceTag, model,
stuff((select ' [ ] ' + description
from tl_sb_accessoryInventory ai join
tl_sb_accessory a
on a.accessoryID = ai.accessoryID
where a.userId = ol.UserId
for xml path ('')
), 1, 11, '') as accessories
FROM tl_sb_oldLaptop ol JOIN
tl_sb_laptopType lt
ON ol.laptopTypeID = lt.laptopTypeID;
You don't have table aliases identifying where the columns come from, so I am just guessing that a.userId = ol.UserId references the right tables.
Also, this replaces certain characters with their XML entity forms. Notably, '<' and '>' turn into '&lt;' and '&gt;'. When I encounter this problem, I use replace() to restore the original characters.
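The correlated-subquery pattern itself can be tried outside SQL Server. Below is a minimal sketch with Python's sqlite3 module, where SQLite's group_concat stands in for the for xml path trick. The laptop and accessory tables and their data are simplified, hypothetical versions of the ones in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified, made-up tables for illustration only.
conn.executescript("""
    CREATE TABLE laptop (userID INTEGER, serviceTag TEXT);
    CREATE TABLE accessory (userID INTEGER, description TEXT);
    INSERT INTO laptop VALUES (6, 'ABC123'), (7, 'XYZ789');
    INSERT INTO accessory VALUES (6, 'Mouse'), (6, 'Dock'), (7, 'Bag');
""")

# The subquery is evaluated once per outer row, so every user gets its own list.
rows = conn.execute("""
    SELECT l.userID, l.serviceTag,
           (SELECT group_concat(a.description, ' [ ] ')
            FROM accessory a
            WHERE a.userID = l.userID) AS accessories
    FROM laptop l
    ORDER BY l.userID
""").fetchall()
for row in rows:
    print(row)
```

Because the subquery is correlated on userID, no loop over tl_sb_user is needed: dropping the fixed userID filter from the outer query is enough to get one row per user.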
Simply leave out the WHERE clause.