iteratively change where condition variable in redshift(pgsql)


Say I have a set of member rows returned by a simple query:
select distinct mbr_id from mbr_base where location = '17957' ;
The result would look like this:
mbr_id           location
000000011441894  17957
000000011437056  17957
000000011437981  17957
000000011441312  17957
000000011440730  17957
000000011482555  17957
000000011498476  17957
This is the result for a single location filter.
However, I have another 49 distinct locations to iterate over and examine.
Finally, I would combine all of these into one result table, ready for analytics.
For example, my pseudo-code in Python would look like this:
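import pandas as pd
# pseudo-code: conn is an open DB connection, locations holds the distinct locations
frames = []
for loc in locations:
    query = f"select * from mbr_base where location = '{loc}'"
    frames.append(pd.read_sql(query, conn))
df = pd.concat(frames, axis=0)  # concat returns a new frame, so keep the result
display(df)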
Can you help me write a procedure for doing this in SQL (preferably pgsql)?
Many thanks.

If you want the individual locations, then use in:
select distinct mbr_id, location
from mbr_base
where location in ( '17957', . . . )
Your sample results have the location. If you just want the mbr_id, then use:
select distinct mbr_id
from mbr_base
where location in ( '17957', . . . )
Now, presumably mbr_id is unique in mbr_base. If so, remove the distinct. In addition, location looks like a number. If it really is a number, then drop the single quotes. So, what you might want is:
select mbr_id
from mbr_base
where location in ( 17957, . . . )
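To illustrate why no loop is needed, here is a minimal sketch in which one set-based IN query returns the rows for every location at once. It uses Python's built-in sqlite3 as a stand-in for Redshift, so treat it as an illustration only; the sample rows, the second location '18001', and the `locations` list are made up for the demo.

```python
import sqlite3

# In-memory SQLite stands in for Redshift; the rows below are invented sample data.
conn = sqlite3.connect(":memory:")
conn.execute("create table mbr_base (mbr_id text, location text)")
conn.executemany(
    "insert into mbr_base values (?, ?)",
    [
        ("000000011441894", "17957"),
        ("000000011437056", "17957"),
        ("000000011441894", "17957"),  # duplicate row, removed by DISTINCT
        ("000000020000001", "18001"),  # hypothetical second location
        ("000000030000001", "99999"),  # a location we do not want
    ],
)

# Hypothetical list of locations to examine (50 of them in the question).
locations = ["17957", "18001"]

# One placeholder per location; a single IN query replaces the per-location loop.
placeholders = ", ".join("?" for _ in locations)
query = (
    "select distinct mbr_id, location from mbr_base "
    f"where location in ({placeholders}) "
    "order by location, mbr_id"
)
rows = conn.execute(query, locations).fetchall()
for row in rows:
    print(row)
```

If every location in the table is wanted, the WHERE clause can simply be dropped; the result is already the combined table that the loop was meant to build.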

Related

SQL - Returning fields based on where clause then joining same table to return max value?

I have a table named Ticket Numbers, which (for this example) contain the columns:
Ticket_Number
Assigned_Group
Assigned_Group_Sequence_No
Reported_Date
Each ticket number could have 4 rows, depending on how many times the ticket changed assigned groups. Some of these rows could have an assigned group of "Desktop Support", but some may not. Here is an example:
[image: example of raw data]
What I am trying to get is an output that contains every ticket number that has a 'Desktop Support' row, together with the assigned group at the max sequence number. Here is what I am trying to produce with SQL:
[image: queried data]
I'm trying to use the following query but have no clue what I'm doing wrong:
select ih.incident_number,ih.assigned_group, incident_history2.maxseq, incident_history2.assigned_group
from incident_history_public as ih
left join
(
select max(assigned_group_seq_no) maxseq, incident_number, assigned_group
from incident_history_public
group by incident_number, assigned_group
) incident_history2
on ih.incident_number = incident_history2.incident_number
and ih.assigned_group_seq_no = incident_history2.maxseq
where ih.ASSIGNED_GROUP LIKE '%DS%'
Does anyone know what I am doing wrong?

You might want to create a proper alias for incident_history. e.g.
from incident_history as incident_history1
and
on incident_history1.ticket_number = incident_history2.ticket_number
and incident_history1.assigned_group_seq_no = incident_history2.maxseq

In my humble opinion a first error could be that I don't see any column named "incident_history2.assigned_group".
I would try a common table expression to get only the ticket numbers that contain "Desktop Support":
WITH desktop AS (
    SELECT DISTINCT Ticket_Number
    FROM incident_history
    WHERE Assigned_Group = 'Desktop Support'
),
Then inner join that result with a derived table holding each ticket number and its maxseq, so that in a second step you can also get the "MAX_Group". Note that tmp continues the same WITH list, so the WITH keyword is not repeated:
tmp AS (
    SELECT i2.Ticket_Number, i2.maxseq
    FROM desktop D INNER JOIN
        (SELECT Ticket_Number, MAX(assigned_group_seq_no) AS maxseq
         FROM incident_history
         GROUP BY Ticket_Number) AS i2
    ON D.Ticket_Number = i2.Ticket_Number
)
SELECT i.Ticket_Number, i.Assigned_Group AS MAX_Group, T.maxseq, i.Reported_Date
FROM tmp T INNER JOIN incident_history i
ON T.Ticket_Number = i.Ticket_Number AND i.assigned_group_seq_no = T.maxseq
I think there are several different ways to solve this, but I really hope this is helpful!
For more information about Common Table Expression: https://www.essentialsql.com/introduction-common-table-expressions-ctes/
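The CTE approach above can be tried end to end with a small sketch. It runs against an in-memory SQLite database through Python's sqlite3, which is only a stand-in for the asker's database; the ticket IDs T1 to T3, the group names other than 'Desktop Support', and the CTE name `last_seq` are invented for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table incident_history ("
    "ticket_number text, assigned_group text, assigned_group_seq_no integer)"
)
# Invented sample data: T1 and T3 touch Desktop Support, T2 never does.
conn.executemany(
    "insert into incident_history values (?, ?, ?)",
    [
        ("T1", "Service Desk", 1),
        ("T1", "Desktop Support", 2),
        ("T1", "Network", 3),
        ("T2", "Service Desk", 1),
        ("T2", "Network", 2),
        ("T3", "Desktop Support", 1),
    ],
)

query = """
with desktop as (
    -- tickets that were ever assigned to Desktop Support
    select distinct ticket_number
    from incident_history
    where assigned_group = 'Desktop Support'
),
last_seq as (
    -- highest sequence number per ticket (grouped by ticket only)
    select ticket_number, max(assigned_group_seq_no) as maxseq
    from incident_history
    group by ticket_number
)
select i.ticket_number, i.assigned_group as max_group, l.maxseq
from desktop d
join last_seq l on l.ticket_number = d.ticket_number
join incident_history i
  on i.ticket_number = l.ticket_number
 and i.assigned_group_seq_no = l.maxseq
order by i.ticket_number
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The point of the design is that the max sequence is computed per ticket only; grouping by assigned group as well, as in the question's subquery, yields one maximum per group rather than one per ticket.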

Postgresql: Update column from select and add condition when multiple rows returned

Basically, I need to update a column using a SELECT, which can return more than one value. If that happens, I'd like to apply a second condition to determine which of those values is to be chosen:
UPDATE train
SET var1 = (
    CASE
        WHEN (SELECT COUNT(*)
              FROM cars
              WHERE train.var2 LIKE cars.var2) > 1
        THEN (
            SELECT var1
            FROM cars
            WHERE train.var2 LIKE cars.var2
              AND cars.var2 IN (
                  SELECT var2
                  FROM cars
                  WHERE train.user_id = cars.user_id)
        )
        ELSE (
            SELECT var1
            FROM cars
            WHERE train.var2 LIKE cars.var2
        )
    END
);
I think the above works, but I repeat the same SELECT three times. Is there a nicer way to avoid that? Maybe there is a simple way to detect when the SELECT returns more than one value and handle it?
Thank you

update train set
var1 = (
select cars.var1
from cars
where train.var2 like cars.var2
order by train.user_id = cars.user_id desc
limit 1);

The above answer is good and works out of the box. If you do a lot of these, take a look at: https://wiki.postgresql.org/wiki/First/last_(aggregate)
Then you can do this:
update train set
var1 = (
select first(cars.var1 order by train.user_id = cars.user_id desc)
from cars
where train.var2 like cars.var2
);
Depending on your exact use-case this may be neater, easier to read, easier to reason about (order by in subselect is full of nasty edge-cases) or just more faff than it's worth.

Delete arguments from array

There is a table with a column called Cars, and this column holds an array: [Audi, BMW, Toyota, ..., VW]
I want to update this table and set Cars to the same array without a few elements (Toyota, ..., BMW).
How can I do this? I want to pass in another array and delete the elements that match.

You can unnest the array, filter, and reaggregate:
select t.*,
(select array_agg(car)
from unnest(t.cars) car
where car not in ( . . . )
) new_cars
from t;
If you want to keep the original ordering:
select t.*,
(select array_agg(u.car order by n)
from unnest(t.cars) with ordinality u(car, n)
where u.car not in ( . . . )
) new_cars
from t

You could call array_remove several times:
SELECT array_remove(
array_remove(
ARRAY['Audi', 'BMW', 'Toyota', 'Opel', 'VW'],
'Audi'
),
'BMW'
);
array_remove
------------------
{Toyota,Opel,VW}
(1 row)

Maybe I can help using pandas in Python, assuming you want to delete all the rows that contain the elements you'd like to remove. Let's say df is your dataframe; then:
import pandas as pd
# select the rows to delete, then drop them by index label
rows_to_delete = df.loc[(df['cars'] == 'Audi') | (df['cars'] == 'VW')]
df = df.drop(rows_to_delete.index)
or you could also do
# keep only rows that are neither 'Audi' nor 'VW' (note & rather than |)
df1 = df.loc[(df['cars'] != 'Audi') & (df['cars'] != 'VW')]
In SQL, you could use
DELETE FROM table WHERE Cars in ('Audi','VW');

SAP Query IMRG Measure documents

I'm learning SAP queries.
I want to get all the measurement documents for a piece of equipment.
To do that, I use 3 tables:
EQUI, IMPTT, IMRG
The query works, but I get all the documents, whereas I only want the latest one by date. I can't manage that. I'm sure I have to add a custom field, but none of the ones I have tried works.
For example, my latest code:
select min( IMRG~INVTS ) IMRG~RECDV
from IMRG inner join IMPTT on
IMRG~POINT = IMPTT~POINT into (INVTS, IMRGVAL)
where IMRG~POINT = IMPTT-POINT AND
IMPTT~MPOBJ = EQUI-OBJNR
and IMRG~CANCL = '' group by IMRG~MDOCM IMRG~RECDV.
ENDSELECT.
Thanks for your help.

You will need to get the date from IMRG via the inverted timestamp field, so the MIN() of this will be the most recent; that part looks correct.
However, your GROUP BY looks wrong. You should be grouping on the IMPTT~POINT field so that you get one record per measurement point. Note that one measurement point (IMPTT) can have many measurements (IMRG), so something like this:
SELECT EQUI-OBJNR, IMPTT~POINT, MIN(IMRG~IMRC_INVTS)
...
GROUP BY EQUI-OBJNR, IMPTT~POINT

If I understood you correctly, you are trying to get the most recent measurement of the equipment regardless of measurement point. You can try this query, which is not very pretty, but it works.
SELECT objnr COUNT(*) MIN( invts )
FROM equi AS eq
JOIN imptt AS tt
ON tt~mpobj = eq~objnr
JOIN imrg AS ig
ON ig~point = tt~point
INTO (wa_objnr, count, wa_invts)
WHERE ig~cancl = ''
GROUP BY objnr.
SELECT SINGLE recdv FROM imrg JOIN imptt ON imptt~point = imrg~point INTO wa_imrgval WHERE invts = wa_invts AND imptt~mpobj = wa_objnr.
WRITE: / wa_objnr, count, wa_invts, wa_imrgval.
ENDSELECT.

Eliminate Duplicate Rows on Outer Join

I am running a query against our Oracle database.
The goal is to return the following columns -
Document Id
Document Creation Date
Organization Code
Document Status
Total Amount
The problem I am running into is with the Organization Code.
It is possible to have a document id with multiple organization codes.
I only want 1 instance - I don't care about the rest (if they exist)
Here is what I currently have -
SELECT * FROM (SELECT DISTINCT (K_HDR.DOC_HDR_ID),
K_HDR.CRTE_DT,
FS_EXT.VAL AS ORG_CODE,
REQ.REQS_STAT_CD,
FS_DOC.FDOC_TOTAL_AMT
FROM PUR_REQS_T REQ,
KREW_DOC_HDR_T K_HDR,
FS_DOC_HEADER_T FS_DOC,
KREW_DOC_HDR_EXT_T FS_EXT
WHERE REQ.FDOC_NBR = K_HDR.DOC_HDR_ID AND
FS_DOC.FDOC_NBR = REQ.FDOC_NBR AND
REQ.FDOC_NBR = FS_EXT.DOC_HDR_ID(+) AND
FS_EXT.KEY_CD(+)= 'organizationCode' AND
(K_HDR.CRTE_DT BETWEEN TO_DATE('2011-10-01 00:00:00', 'YYYY-MM-DD HH24:MI:SS')
AND
TO_DATE('2012-09-30 23:59:59', 'YYYY-MM-DD HH24:MI:SS')))
FINAL_SEARCH ORDER BY FINAL_SEARCH.CRTE_DT;
The query above returns 14,933 rows.
The correct number of rows I should be getting is 14,789.
The culprit is the Organization Code.
For instance, as I'm looking at the result sets I see the following -
DOC_ID CRTE_DT ORG_CD STAT TOTAL
.
.
.
496256 5-OCT-11 0 CLOS 2779.89
496258 5-OCT-11 8050 CLOS 1737.5
496258 5-OCT-11 8000 CLOS 1737.5
.
.
.
How do I get rid of the annoying 2nd instance of 496258 which lives in the FS_EXT Table?
(Obviously I need to get rid of the other instances of the same type of duplicate values)

You could wrap the whole thing in one more SELECT which uses a GROUP BY to get only the MIN organization code.
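A minimal sketch of that GROUP BY idea, using the three sample rows from the question. It runs on SQLite through Python's sqlite3 rather than Oracle, and `search_result` is a hypothetical table standing in for the output of the original query (dates are rewritten in ISO form for the demo).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- search_result is a hypothetical stand-in for the original query's output
create table search_result (
    doc_id integer, crte_dt text, org_cd text, stat text, total real);
insert into search_result values
    (496256, '2011-10-05', '0',    'CLOS', 2779.89),
    (496258, '2011-10-05', '8050', 'CLOS', 1737.5),
    (496258, '2011-10-05', '8000', 'CLOS', 1737.5);
""")

# Group on every column except the organization code and keep its minimum,
# which leaves exactly one row per document.
rows = conn.execute("""
select doc_id, crte_dt, min(org_cd) as org_cd, stat, total
from search_result
group by doc_id, crte_dt, stat, total
order by crte_dt, doc_id
""").fetchall()
for row in rows:
    print(row)
```

MIN is an arbitrary but deterministic choice here, which fits the stated requirement of keeping one instance without caring which one.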

So - I ended up using another column in the FS_EXT Table to further filter down to the first instance of the Org Code.
Here is what the FS_EXT Table looks like if I am looking at columns that are filtered to only show entries for Document Id = 496258.
(Mind you that there could be different number of rows for any given doc id)
DOC_HDR_EXT_ID DOC_HDR_ID KEY_CD VAL
13318096 496258 documentDescription misc items
13318098 496258 organizationDocNumber (null)
13318099 496258 statusDescription Closed
13318101 496258 chartAndOrgCodeForResult KS-1234
13318102 496258 vendorName APPLE COMPUTERS
13318103
.
.
.
.
.
13318115 496258 organizationCode 8000
13318116
.
.
.
1338118 496258 organizationCode 8050
And here is my new query which circumvents using THE JOIN OPERATION.
Notice that I use a SUBQUERY instead. To get the first instance of the OrganizationCode, I use the MIN operator on the DOC_HDR_EXT_ID column and then retrieve the organizationCode VAL using that ID and pass that back to the main QUERY.
SELECT * FROM ( SELECT DISTINCT (K_HDR.DOC_HDR_ID),
K_HDR.CRTE_DT,
(SELECT KS_EXT.VAL AS ORG_CODE
FROM KREW_DOC_HDR_EXT_T KS_EXT
WHERE KS_EXT.DOC_HDR_EXT_ID =(
SELECT MIN(DOC_HDR_EXT_ID)
FROM KREW_DOC_HDR_EXT_T FS_EXT_INNER
WHERE FS_EXT_INNER.DOC_HDR_ID = K_HDR.DOC_HDR_ID
AND FS_EXT_INNER.KEY_CD = 'organizationCode')) AS ORG_CODE,
REQ.REQS_STAT_CD,
FS_DOC.FDOC_TOTAL_AMT
FROM PUR_REQS_T REQ,
KREW_DOC_HDR_T K_HDR,
FS_DOC_HEADER_T FS_DOC,
KREW_DOC_HDR_EXT_T FS_EXT
WHERE REQ.FDOC_NBR = K_HDR.DOC_HDR_ID AND
FS_DOC.FDOC_NBR = REQ.FDOC_NBR AND
REQ.FDOC_NBR = FS_EXT.DOC_HDR_ID(+) AND
FS_EXT.KEY_CD(+)= 'organizationCode' AND
(K_HDR.CRTE_DT BETWEEN TO_DATE('2011-10-01 00:00:00', 'YYYY-MM-DD HH24:MI:SS') AND TO_DATE('2012-09-30 23:59:59', 'YYYY-MM-DD HH24:MI:SS')))
FINAL_SEARCH ORDER BY FINAL_SEARCH.CRTE_DT;
Thanks for your recommendations, @Alex Poole and @StilesCrisis.
You got me thinking differently about my approach to this problem, and my solution integrates both of your suggestions: the MIN approach from Stiles and filtering on another column per Alex Poole.
