SQL Server - Complicated SQL query 2

I have two tables, document and documentd. The first contains the document number 'doc_num' (primary key), the document type 'doc_type' (FACA, BLCO, BLCM, BLCK, ...) and the document date 'doc_date'.
It holds many documents with different dates and different types.
Table DOCUMENT:
| DOC_NUM | DOC_TYPE | DOC_DATE |
|----------------|----------|------------|
| ACHAT190122001 | FACA | 22/01/2019 |
| ACHAT190222001 | FACA | 22/02/2019 |
| ACHAT190322001 | FACA | 22/03/2019 |
| BLCO190122001 | BLCO | 22/01/2019 |
| BLCO190123001 | BLCO | 23/01/2019 |
| BLCM190122001 | BLCM | 22/01/2019 |
| ACHAT190102010 | FACA | 02/01/2019 |
| ACHAT190103011 | FACA | 03/01/2019 |
| ACHAT190422005 | FACA | 22/04/2019 |
The second table contains 'doc_num' as a foreign key, the article code 'art_code' for each article in a document, and the article price 'art_prix'.
It holds the details of each document in the document table; the same article can appear in several documents, with the same or different prices.
Table DOCUMENTD:
| DOC_NUM | ART_CODE | ART_PRIX |
|----------------|----------|----------|
| ACHAT190122001 | ARTICLE1 | 1000 |
| ACHAT190122001 | ARTICLE2 | 2000 |
| ACHAT190102010 | ARTICLE1 | 950 |
| ACHAT190103011 | ARTICLE1 | 980 |
| ACHAT190422005 | ARTICLE2 | 1200 |
| ACHAT190120006 | ARTICLE2 | 1000 |
| BLCO190122001 | ARTICLE1 | 900 |
| BLCO190123001 | ARTICLE2 | 800 |
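My goal is to join the two tables on 'doc_num' and select all BLC-type documents with their articles. The prices, however, must not be the BLC prices: each one must be the most recently updated price of that article in a FACA-type document, shown with that FACA document's date. For example:
RESULT: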
| BLCO190122001 | ARTICLE1 | 1000 | 22/01/2019 |
| BLCO190123001 | ARTICLE2 | 1200 | 22/04/2019 |

As I see it, you want to join all the FACA type documents with all the BLC? documents on the ART_CODE column.
I'm certain it can be done with a single SQL query, but for me it's easier to do it as follows.
Create a view for all the FACA type documents...
create view FACAS as
select DOCUMENTD.DOC_NUM, DOCUMENTD.ART_CODE, DOCUMENTD.ART_PRIX, DOCUMENT.DOC_DATE
from DOCUMENTD join DOCUMENT on DOCUMENTD.DOC_NUM = DOCUMENT.DOC_NUM
where DOCUMENT.DOC_TYPE = 'FACA'
Create another view for all the BLC? type documents...
create view BLC_S as
select DOCUMENTD.DOC_NUM, DOCUMENTD.ART_CODE, DOCUMENTD.ART_PRIX, DOCUMENT.DOC_DATE
from DOCUMENTD join DOCUMENT on DOCUMENTD.DOC_NUM = DOCUMENT.DOC_NUM
where DOCUMENT.DOC_TYPE like 'BLC%'
Now query both views, joining on the ART_CODE column...
select BLC_S.DOC_NUM, BLC_S.ART_CODE, FACAS.ART_PRIX, BLC_S.DOC_DATE
from FACAS join BLC_S on FACAS.ART_CODE = BLC_S.ART_CODE

Here's one way:
-- Table variables are declared with an @ prefix (# is for temp tables).
DECLARE @document TABLE (
DOC_NUM VARCHAR(MAX)
,DOC_TYPE VARCHAR(MAX)
,DOC_DATE VARCHAR(MAX)
)
INSERT INTO @document VALUES
('ACHAT190122001', 'FACA', '22/01/2019')
, ('ACHAT190222001', 'FACA', '22/02/2019')
, ('ACHAT190322001', 'FACA', '22/03/2019')
, ('BLCO190122001', 'BLCO', '22/01/2019')
, ('BLCO190123001', 'BLCO', '23/01/2019')
, ('BLCM190122001', 'BLCM', '22/01/2019')
DECLARE @documentd TABLE (
DOC_NUM VARCHAR(MAX)
,ART_CODE VARCHAR(MAX)
,ART_PRIX SMALLMONEY
)
INSERT INTO @documentd VALUES
('ACHAT190122001', 'ARTICLE1', 1000)
,('ACHAT190122001', 'ARTICLE2', 2000)
,('BLCO190122001', 'ARTICLE1', 900)
,('BLCO190123001', 'ARTICLE2', 800)
-- d1/dd1: the BLCO document and its articles
-- dd2/d2: rows for the same article in documents of a different type
SELECT d1.DOC_NUM, dd1.ART_CODE, dd2.ART_PRIX, d2.DOC_DATE from @document d1
INNER JOIN @documentd dd1 ON dd1.DOC_NUM = d1.DOC_NUM
INNER JOIN @documentd dd2 ON dd2.ART_CODE = dd1.ART_CODE
INNER JOIN @document d2 ON d2.DOC_NUM = dd2.DOC_NUM AND d2.DOC_TYPE <> d1.DOC_TYPE
WHERE d1.DOC_TYPE = 'BLCO'
This returns:
DOC_NUM ART_CODE ART_PRIX DOC_DATE
BLCO190122001 ARTICLE1 1000.00 22/01/2019
BLCO190123001 ARTICLE2 2000.00 22/01/2019
From your example results, I assumed you wanted only the BLCO documents and not the BLCM. If you want both, just change the last line to be:
WHERE d1.DOC_TYPE LIKE 'BLC%' AND d2.DOC_TYPE = 'FACA'
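The question asks for the last updated FACA price per article, which needs one more filter on the FACA side of the join. Here is a minimal sketch of that step. It runs against SQLite through Python's sqlite3 module rather than SQL Server, and it assumes the dates are stored as ISO 'YYYY-MM-DD' text so that MAX() orders them chronologically (the dd/mm/yyyy strings in the question would not sort that way as text).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE document  (doc_num TEXT PRIMARY KEY, doc_type TEXT, doc_date TEXT);
CREATE TABLE documentd (doc_num TEXT, art_code TEXT, art_prix INTEGER);
""")
# Sample rows from the question; dates rewritten as ISO text (an assumption).
conn.executemany("INSERT INTO document VALUES (?, ?, ?)", [
    ("ACHAT190122001", "FACA", "2019-01-22"),
    ("ACHAT190222001", "FACA", "2019-02-22"),
    ("ACHAT190322001", "FACA", "2019-03-22"),
    ("BLCO190122001", "BLCO", "2019-01-22"),
    ("BLCO190123001", "BLCO", "2019-01-23"),
    ("BLCM190122001", "BLCM", "2019-01-22"),
    ("ACHAT190102010", "FACA", "2019-01-02"),
    ("ACHAT190103011", "FACA", "2019-01-03"),
    ("ACHAT190422005", "FACA", "2019-04-22"),
])
conn.executemany("INSERT INTO documentd VALUES (?, ?, ?)", [
    ("ACHAT190122001", "ARTICLE1", 1000),
    ("ACHAT190122001", "ARTICLE2", 2000),
    ("ACHAT190102010", "ARTICLE1", 950),
    ("ACHAT190103011", "ARTICLE1", 980),
    ("ACHAT190422005", "ARTICLE2", 1200),
    ("ACHAT190120006", "ARTICLE2", 1000),
    ("BLCO190122001", "ARTICLE1", 900),
    ("BLCO190123001", "ARTICLE2", 800),
])

rows = conn.execute("""
SELECT blc.doc_num, bd.art_code, fd.art_prix, f.doc_date
FROM document  AS blc
JOIN documentd AS bd ON bd.doc_num = blc.doc_num
JOIN documentd AS fd ON fd.art_code = bd.art_code
JOIN document  AS f  ON f.doc_num = fd.doc_num AND f.doc_type = 'FACA'
WHERE blc.doc_type LIKE 'BLC%'
  -- keep only the FACA row with the latest date for this article
  AND f.doc_date = (SELECT MAX(f2.doc_date)
                    FROM document  AS f2
                    JOIN documentd AS fd2 ON fd2.doc_num = f2.doc_num
                    WHERE f2.doc_type = 'FACA'
                      AND fd2.art_code = bd.art_code)
ORDER BY blc.doc_num
""").fetchall()

for row in rows:
    print(row)
```

On the question's sample data this yields ARTICLE1 at 1000 and ARTICLE2 at 1200, matching the expected result. If two FACA documents for the same article share the latest date, both rows come back, so a tie-breaker would be needed in that case.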

Related

TSQL Select max value associated with an ID from a table joined on a different ID?

I have 3 tables structured like this:
Shipment
+-------------+--------------------+
| Shipment_ID | Shipment_ID_Master |
+-------------+--------------------+
| 4767 | 4767 |
| 88359 | 28431 |
+-------------+--------------------+
Factory
+------------+-------------+
| Factory_ID | Shipment_ID |
+------------+-------------+
| 338161 | 4767 |
| 1178567 | 88359 |
| 1178568 | 88359 |
+------------+-------------+
Coverage
+------------+-----------+----------+
| Factory_ID | Public_ID | Revision |
+------------+-----------+----------+
| 338161 | 2354 | 2 |
| 1178567 | 32436 | 4 |
| 1178568 | 2354 | 3 |
+------------+-----------+----------+
I am trying to build a view that displays a row for only the max Public_ID associated with a Shipment_ID. The view should look like this:
+-------------+--------------------+------------+-----------+----------+
| Shipment_ID | Shipment_ID_Master | Factory_ID | Public_ID | Revision |
+-------------+--------------------+------------+-----------+----------+
| 4767 | 4767 | 338161 | 2354 | 2 |
| 88359 | 28431 | 1178567 | 32436 | 4 |
+-------------+--------------------+------------+-----------+----------+
I have a query that works to build this view, but it is too slow. When my application joins on this view the query is taking several minutes to finish execution. This is the query:
SELECT f.Shipment_ID,
s.Shipment_ID_Master,
f.Factory_ID,
c.Public_ID,
c.Revision
FROM Coverage c
JOIN Factory f ON c.Factory_ID = f.Factory_ID
JOIN Shipment s ON s.Shipment_ID = f.Shipment_ID
WHERE Public_ID = (
SELECT MAX(Public_ID)
FROM Coverage c2
JOIN Factory f2 ON c2.Factory_ID = f2.Factory_ID
WHERE f2.Shipment_ID = f.Shipment_ID
)
I think referencing this view is so slow because of the logic in the where clause. There must be a better and faster way to do this.
How can I select the maximum Public_ID associated with a Shipment_ID when the Shipment_ID is not stored on the same table as the Public_ID? Is it possible to do this without a where clause?
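You can use ROW_NUMBER() to get better performance than the correlated subquery in your WHERE clause: number the rows within each Shipment_ID by descending Public_ID, then keep only row 1.
Try the following: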
;WITH cte AS
(
SELECT s.Shipment_ID, s.Shipment_ID_Master, f.Factory_ID, c.Public_ID, c.Revision, row_number() OVER (PARTITION BY s.Shipment_ID ORDER BY c.Public_ID desc) AS rn
FROM #Shipment s
JOIN #Factory f ON f.Shipment_ID = s.Shipment_ID
JOIN #Coverage c ON c.Factory_ID = f.Factory_ID
)
SELECT c.Shipment_ID, c.Shipment_ID_Master, c.Factory_ID, c.Public_ID, c.Revision
FROM cte c WHERE rn = 1
Please see db<>fiddle here.
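For a self-contained way to try the ROW_NUMBER() approach on the question's sample rows, here is a minimal sketch using SQLite through Python's sqlite3 module. It assumes a SQLite build with window-function support (3.25 or later) and uses plain table names instead of temp tables; the CTE name ranked is just illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Shipment (Shipment_ID INTEGER, Shipment_ID_Master INTEGER);
CREATE TABLE Factory  (Factory_ID INTEGER, Shipment_ID INTEGER);
CREATE TABLE Coverage (Factory_ID INTEGER, Public_ID INTEGER, Revision INTEGER);
INSERT INTO Shipment VALUES (4767, 4767), (88359, 28431);
INSERT INTO Factory  VALUES (338161, 4767), (1178567, 88359), (1178568, 88359);
INSERT INTO Coverage VALUES (338161, 2354, 2), (1178567, 32436, 4), (1178568, 2354, 3);
""")

best_rows = conn.execute("""
WITH ranked AS (
    SELECT s.Shipment_ID, s.Shipment_ID_Master, f.Factory_ID, c.Public_ID, c.Revision,
           -- rn = 1 marks the highest Public_ID within each shipment
           ROW_NUMBER() OVER (PARTITION BY s.Shipment_ID
                              ORDER BY c.Public_ID DESC) AS rn
    FROM Shipment s
    JOIN Factory  f ON f.Shipment_ID = s.Shipment_ID
    JOIN Coverage c ON c.Factory_ID  = f.Factory_ID
)
SELECT Shipment_ID, Shipment_ID_Master, Factory_ID, Public_ID, Revision
FROM ranked
WHERE rn = 1
ORDER BY Shipment_ID
""").fetchall()

for row in best_rows:
    print(row)
```

Each shipment is read once and ranked in a single pass, instead of re-running an aggregate subquery per outer row.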

How can I summarize / pivot data with oracle sql

I have a table containing geological resource information.
| Property | Zone | Area | Category | Tonnage | Au_gt | Au_oz |
|----------|------|-------------|-----------|---------|-------|-------|
| Ket | Eel | Open Pit | Measured | 43400 | 5.52 | 7700 |
| Ket | Eel | Open Pit | Inferred | 51400 | 5.88 | 9700 |
| Ket | Eel | Open Pit | Indicated | 357300 | 6.41 | 73600 |
| Ket | Eel | Underground | Measured | 3300 | 7.16 | 800 |
| Ket | Eel | Underground | Inferred | 14700 | 6.16 | 2900 |
| Ket | Eel | Underground | Indicated | 168100 | 8.85 | 47800 |
I would like to summarize the data so that it can be read more easily by our clients.
| Property | Zone | Category | Open_Pit_Tonnage | Open_Pit_Au_gt | Open_Pit_Au_oz | Underground_tonnage | Underground_au_gt | Underground_au_oz | Combined_tonnage | Combined_au_gt | Combined_au_oz |
|----------|------|-----------|------------------|----------------|----------------|---------------------|-------------------|-------------------|------------------|----------------|----------------|
| Ket | Eel | Measured | 43,400 | 5.52 | 7,700 | 3,300 | 7.16 | 800 | 46,700 | 5.64 | 8,500 |
| Ket | Eel | Indicated | 357,300 | 6.41 | 73,600 | 168,100 | 8.85 | 47,800 | 525,400 | 7.19 | 121,400 |
| Ket | Eel | Inferred | 51,400 | 5.88 | 9,700 | 14,700 | 6.16 | 2,900 | 66,100 | 5.94 | 12,600 |
I'm fairly new to pivot tables. How could I write a query to translate and summarize the data?
Thanks!
If your Oracle version is 11.1 or higher (which it should be if you are a relatively new user!) then you can use the PIVOT operator, as shown below.
Note that the result of the PIVOT operation can be given an alias (I used p) - this makes it easier to write the SELECT clause.
I assumed the name of your table is geological_data - replace it with your actual table name.
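select p.*
, open_pit_tonnage + underground_tonnage as combined_tonnage
-- the combined grade is a tonnage-weighted average, not a sum of the two grades
, round((open_pit_tonnage * open_pit_au_gt + underground_tonnage * underground_au_gt)
/ (open_pit_tonnage + underground_tonnage), 2) as combined_au_gt
, open_pit_au_oz + underground_au_oz as combined_au_oz
from geological_data
pivot (sum(tonnage) as tonnage, sum(au_gt) as au_gt, sum(au_oz) as au_oz
for area in ('Open Pit' as open_pit, 'Underground' as underground)) p
;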
Conditional aggregation is a simple method:
select Property, Zone, Category,
max(case when area = 'Open Pit' then tonnage end) as open_pit_tonnage,
max(case when area = 'Open Pit' then Au_gt end) as open_pit_Au_gt,
max(case when area = 'Open Pit' then Au_oz end) as open_pit_Au_oz,
max(case when area = 'Underground' then tonnage end) as Underground_tonnage,
max(case when area = 'Underground' then Au_gt end) as Underground_Au_gt,
max(case when area = 'Underground' then Au_oz end) as Underground_Au_oz
from t
group by Property, Zone, Category
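The Combined_* columns in the desired output can be produced with the same GROUP BY. The expected Combined_au_gt values (5.64 for Measured, for example) match a tonnage-weighted average of the two grades. A minimal sketch of that calculation, run on SQLite through Python's sqlite3 module rather than Oracle, with the table name t borrowed from the query above:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE t (Property TEXT, Zone TEXT, Area TEXT, Category TEXT,
                                Tonnage REAL, Au_gt REAL, Au_oz REAL)""")
conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?, ?, ?, ?)", [
    ("Ket", "Eel", "Open Pit", "Measured", 43400, 5.52, 7700),
    ("Ket", "Eel", "Open Pit", "Inferred", 51400, 5.88, 9700),
    ("Ket", "Eel", "Open Pit", "Indicated", 357300, 6.41, 73600),
    ("Ket", "Eel", "Underground", "Measured", 3300, 7.16, 800),
    ("Ket", "Eel", "Underground", "Inferred", 14700, 6.16, 2900),
    ("Ket", "Eel", "Underground", "Indicated", 168100, 8.85, 47800),
])

summary = conn.execute("""
SELECT Property, Zone, Category,
       SUM(Tonnage) AS combined_tonnage,
       -- weight each grade by its tonnage before averaging
       ROUND(SUM(Tonnage * Au_gt) / SUM(Tonnage), 2) AS combined_au_gt,
       SUM(Au_oz) AS combined_au_oz
FROM t
GROUP BY Property, Zone, Category
ORDER BY Category
""").fetchall()

for row in summary:
    print(row)
```

The same three expressions can be added to the select list of the conditional-aggregation query, since they aggregate over both areas within each group.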
SQL Server's PIVOT operator converts rows to columns.
In the example below, the goal is to turn the category names from the first column of the output into multiple columns and count the number of products in each category.
You can use this query as a reference and adapt it to your table:
SELECT * FROM
(
SELECT
category_name,
product_id,
model_year
FROM
production.products p
INNER JOIN production.categories c
ON c.category_id = p.category_id
) t
PIVOT(
COUNT(product_id)
FOR category_name IN (
[Children Bicycles],
[Comfort Bicycles],
[Cruisers Bicycles],
[Cyclocross Bicycles],
[Electric Bikes],
[Mountain Bikes],
[Road Bikes])
) AS pivot_table;

When Querying Many-To-Many Relationship in SQL, Return Multiple Connections As an Array In Single Row?

Basically, I have 3 tables, titles, providers, and provider_titles.
Let's say they look like this:
| title_id | title_name |
|------------|----------------|
| 1 | San Andreas |
| 2 |Human Centipede |
| 3 | Zoolander 2 |
| 4 | Hot Pursuit |
| provider_id| provider_name |
|------------|----------------|
| 1 | Hulu |
| 2 | Netflix |
| 3 | Amazon_Prime |
| 4 | HBO_GO |
| provider_id| title_id |
|------------|----------------|
| 1 | 1 |
| 1 | 2 |
| 2 | 1 |
| 3 | 1 |
| 3 | 3 |
| 4 | 4 |
So, clearly there are titles with multiple providers, yeah? Typical many-to-many so far.
So what I'm doing to query it is with a JOIN like the following:
SELECT *
FROM provider_title
JOIN provider ON provider_title.provider_id = provider.provider_id
JOIN title ON title.title_id = provider_title.title_id
WHERE provider.provider_name IN ('Netflix', 'HBO_GO', 'Hulu', 'Amazon_Prime')
Ok, now to the actual issue. I don't want repeated title names back, but I do want all of the providers associated with the title. Let me explain with another table. Here is what I am getting back with the current query, as is:
| provider_id| provider_name | title_id | title_name |
|------------|---------------|----------|---------------|
| 1 | Hulu | 1|San Andreas |
| 1 | Hulu | 2|Human Centipede|
| 2 | Netflix | 1|San Andreas |
| 3 | Amazon_Prime | 1|San Andreas |
| 3 | Amazon_Prime | 3|Zoolander 2 |
| 4 | HBO_GO | 4|Hot Pursuit |
But what I really want would be something more like
| provider_id| provider_name |title_id| title_name|
|------------|-----------------------------|--------|-----------|
| [1, 2, 3] |[Hulu, Netflix, Amazon_Prime]| 1|San Andreas|
Meaning I only want distinct titles back, but I still want each title's associated providers. Is this only possible to do post-sql query with logic iterating through the returned rows?
Depending on your database engine, there may be an aggregation function to help achieve this.
For example, this SQLfiddle demonstrates the postgres array_agg function:
SELECT t.title_id,
t.title_name,
array_agg( p.provider_id ),
array_agg( p.provider_name )
FROM provider_title as pt
JOIN
provider as p
ON pt.provider_id = p.provider_id
JOIN title as t
ON t.title_id = pt.title_id
GROUP BY t.title_id,
t.title_name
Other database engines have equivalents. For example:
MySQL has group_concat
Oracle has listagg
SQLite has group_concat (as well!)
If your database isn't covered by the above, you can google '[Your database engine] aggregate comma delimited string'
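As an illustration of the group_concat route, here is a minimal sketch using SQLite through Python's sqlite3 module. Table and column names follow the question's query. The 'id:name' packing is just one hypothetical way to keep each id paired with its name, and it assumes provider names contain no ':' or ','.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE title (title_id INTEGER PRIMARY KEY, title_name TEXT);
CREATE TABLE provider (provider_id INTEGER PRIMARY KEY, provider_name TEXT);
CREATE TABLE provider_title (provider_id INTEGER, title_id INTEGER);
INSERT INTO title VALUES (1, 'San Andreas'), (2, 'Human Centipede'),
                         (3, 'Zoolander 2'), (4, 'Hot Pursuit');
INSERT INTO provider VALUES (1, 'Hulu'), (2, 'Netflix'),
                            (3, 'Amazon_Prime'), (4, 'HBO_GO');
INSERT INTO provider_title VALUES (1, 1), (1, 2), (2, 1), (3, 1), (3, 3), (4, 4);
""")

rows = conn.execute("""
SELECT t.title_id, t.title_name,
       group_concat(p.provider_id || ':' || p.provider_name) AS providers
FROM provider_title AS pt
JOIN provider AS p ON p.provider_id = pt.provider_id
JOIN title    AS t ON t.title_id    = pt.title_id
GROUP BY t.title_id, t.title_name
ORDER BY t.title_id
""").fetchall()

def unpack(packed):
    """Split 'id:name,id:name' into a sorted list of (id, name) tuples."""
    pairs = (item.split(":", 1) for item in packed.split(","))
    # SQLite does not guarantee concatenation order, so sort for stable output.
    return sorted((int(pid), name) for pid, name in pairs)

titles = [(title_id, name, unpack(packed)) for title_id, name, packed in rows]

for entry in titles:
    print(entry)
```

Each title comes back exactly once, with its providers collected into a list on the client side, so no row-by-row de-duplication loop is needed.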

Four Table Join in BigQuery

Okay, so I'm trying to link together four different tables, and it's getting very difficult. I've provided snippets of each table in the hope that you all can help out.
Table 1: data
+--------+--------+-----------+
| charge | amount | date |
+--------+--------+-----------+
| 123 | 10000 | 2/10/2016 |
| 456 | 10000 | 1/28/2016 |
| 789 | 10000 | 3/30/2016 |
+--------+--------+-----------+
Table 2: data_metadata
+--------+------------+------------+
| charge | key | value |
+--------+------------+------------+
| 123 | identifier | trrkfll212 |
| 456 | code | test |
| 789 | ID | 123xyz |
+--------+------------+------------+
Table 3: buyer
+-----+-----------+----------+----------+
| id | date | discount | plan |
+-----+-----------+----------+----------+
| ABC | 2/13/2016 | yes | option a |
| DEF | 2/1/2016 | yes | option a |
| GHI | 1/22/2016 | no | option a |
+-----+-----------+----------+----------+
Table 4: buyer_metadata
+--------------+-----------+--------+
| id | key | value |
+--------------+-----------+--------+
| ABC | migration | TRUE |
| DEF | emid | foo |
| GHI | ID | 123xyz |
+--------------+-----------+--------+
Okay, so the tables data and data_metadata are obviously connected by the charge column.
The tables buyer and buyer_metadata are connected by the id column.
But I want to link all of them together. I'm pretty sure the way to accomplish this is through linking the metadata tables together through the common field in the "value" column (in this example: 123xyz).
Could anyone help?
It might look something like this, if all the "link" columns are unique:
SELECT *
FROM data d
JOIN data_metadata dm ON d.charge = dm.charge
JOIN buyer_metadata bm ON dm.value = bm.value
JOIN buyer b ON bm.id = b.id
If not, I think you'll have to use something like a GROUP BY clause.
Let's take it in two steps, first create composite tables for data and buyer. Composite table for data:
SELECT data.charge, data.amount, data.date,
data_metadata.key, data_metadata.value
FROM [data] AS data
JOIN (SELECT charge, key, value FROM [data_metadata]) AS data_metadata
ON data.charge = data_metadata.charge
And composite table for buyer:
SELECT buyer.id, buyer.date, buyer.discount, buyer.plan,
buyer_metadata.key, buyer_metadata.value
FROM [buyer] AS buyer
JOIN (SELECT id, key, value FROM [buyer_metadata]) AS buyer_metadata
ON buyer.id = buyer_metadata.id
And then let's join the two composite tables
SELECT composite_data.*, composite_buyer.*
FROM (
SELECT data.charge, data.amount, data.date,
data_metadata.key, data_metadata.value
FROM [data] AS data
JOIN (SELECT charge, key, value FROM [data_metadata]) AS data_metadata
ON data.charge = data_metadata.charge) AS composite_data
JOIN (
SELECT buyer.id, buyer.date, buyer.discount, buyer.plan,
buyer_metadata.key, buyer_metadata.value
FROM [buyer] AS buyer
JOIN (SELECT id, key, value FROM [buyer_metadata]) AS buyer_metadata
ON buyer.id = buyer_metadata.id) AS composite_buyer
ON composite_data.value = composite_buyer.value
I haven't tested it but it's probably close.
For reference, here is the page on BigQuery JOINs. And have you seen this SO?
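To check the join logic on the question's sample rows without a BigQuery project, here is a minimal sketch using SQLite through Python's sqlite3 module. It links the two metadata tables on the value column only, as the question proposes. The column names key, value and plan are quoted defensively because they overlap with SQL keywords.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE data (charge INTEGER, amount INTEGER, date TEXT);
CREATE TABLE data_metadata (charge INTEGER, "key" TEXT, "value" TEXT);
CREATE TABLE buyer (id TEXT, date TEXT, discount TEXT, "plan" TEXT);
CREATE TABLE buyer_metadata (id TEXT, "key" TEXT, "value" TEXT);
INSERT INTO data VALUES (123, 10000, '2/10/2016'), (456, 10000, '1/28/2016'),
                        (789, 10000, '3/30/2016');
INSERT INTO data_metadata VALUES (123, 'identifier', 'trrkfll212'),
                                 (456, 'code', 'test'), (789, 'ID', '123xyz');
INSERT INTO buyer VALUES ('ABC', '2/13/2016', 'yes', 'option a'),
                         ('DEF', '2/1/2016', 'yes', 'option a'),
                         ('GHI', '1/22/2016', 'no', 'option a');
INSERT INTO buyer_metadata VALUES ('ABC', 'migration', 'TRUE'),
                                  ('DEF', 'emid', 'foo'), ('GHI', 'ID', '123xyz');
""")

linked = conn.execute("""
SELECT d.charge, d.amount, dm."value", b.id, b."plan"
FROM data AS d
JOIN data_metadata  AS dm ON dm.charge  = d.charge
-- the shared metadata value is the only link between the two halves
JOIN buyer_metadata AS bm ON bm."value" = dm."value"
JOIN buyer          AS b  ON b.id       = bm.id
""").fetchall()

for row in linked:
    print(row)
```

Only charge 789 and buyer GHI share a metadata value (123xyz) in the sample, so a single linked row comes back. If unrelated keys can hold the same value, adding a condition on the key column to the metadata join would avoid false matches.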

casting a REAL as INT and comparing

I am casting a real to an int and a float to an int and comparing the two like this:
where
cast(a.[SUM(PAID_AMT)] as int)!=cast(b.PAID_AMT as int)
But I am still getting results where the two are equal. For example:
+-----------+-----------+------------+------------+----------+
| accn | load_dt | pmtdt | sumpaidamt | Bpaidamt |
+-----------+-----------+------------+------------+----------+
| A133312 | 6/7/2011 | 11/28/2011 | 98.39 | 98.39 |
| A445070 | 6/2/2011 | 9/22/2011 | 204.93 | 204.93 |
| A465606 | 5/19/2011 | 10/19/2011 | 560.79 | 560.79 |
| A508742 | 7/12/2011 | 10/19/2011 | 279.65 | 279.65 |
| A567730 | 5/27/2011 | 10/24/2011 | 212.76 | 212.76 |
| A617277 | 7/12/2011 | 10/12/2011 | 322.02 | 322.02 |
| A626384 | 6/16/2011 | 10/21/2011 | 415.84 | 415.84 |
| AA0000044 | 5/12/2011 | 5/23/2011 | 197.38 | 197.38 |
+-----------+-----------+------------+------------+----------+
Here is the full query:
select
a.accn,
a.load_dt,
a.pmtdt,
a.[SUM(PAID_AMT)] sumpaidamt,
sum(b.paid_amt) Bpaidamt
from
[MILLENNIUM_DW_DEV].[dbo].[Millennium_Payment_Data_May2011_July2012] a
join
F_PAYOR_PAYMENTS_DAILY b
on
a.accn=b.ACCESSION_ID
and
a.final_rpt_dt=b.FINAL_REPORT_DATE
and
a.load_dt=b.LOAD_DATE
and
a.pmtdt=b.PAYMENT_DATE
where
cast(a.[SUM(PAID_AMT)] as int)!=cast(b.PAID_AMT as int)
group by
a.accn,
a.load_dt,
a.pmtdt,
a.[SUM(PAID_AMT)]
What am I doing wrong? How do I return only records that are NOT equal?
I don't see why there is an issue.
The query is returning the sum of the payments in b (sum(b.paid_amt) Bpaidamt). The where clause is comparing individual payments. This just means that there is more than one payment.
Perhaps your intention is to have a HAVING clause instead:
having cast(a.[SUM(PAID_AMT)] as int)!=cast(sum(b.PAID_AMT) as int)
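The difference between filtering in WHERE and in HAVING can be seen on a small example. This is a minimal sketch using SQLite through Python's sqlite3 module; the tables a and b, their columns and the sample amounts are hypothetical stand-ins for the real tables in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE a (accn TEXT, sum_paid_amt REAL);  -- pre-aggregated totals
CREATE TABLE b (accn TEXT, paid_amt REAL);      -- individual payments
INSERT INTO a VALUES ('A1', 98.39), ('A2', 200.00);
INSERT INTO b VALUES ('A1', 50.00), ('A1', 48.39),
                     ('A2', 100.00), ('A2', 50.00);
""")

# WHERE runs before grouping: each single payment is compared with the total,
# so A1 passes the filter even though its payments add up to 98.39.
where_accns = [row[0] for row in conn.execute("""
SELECT a.accn, a.sum_paid_amt, SUM(b.paid_amt)
FROM a JOIN b ON b.accn = a.accn
WHERE CAST(a.sum_paid_amt AS INTEGER) != CAST(b.paid_amt AS INTEGER)
GROUP BY a.accn, a.sum_paid_amt
ORDER BY a.accn
""")]

# HAVING runs after grouping: the total is compared with the summed payments.
having_accns = [row[0] for row in conn.execute("""
SELECT a.accn, a.sum_paid_amt, SUM(b.paid_amt)
FROM a JOIN b ON b.accn = a.accn
GROUP BY a.accn, a.sum_paid_amt
HAVING CAST(a.sum_paid_amt AS INTEGER) != CAST(SUM(b.paid_amt) AS INTEGER)
ORDER BY a.accn
""")]

print(where_accns)
print(having_accns)
```

With the WHERE version both accounts are returned; with HAVING only A2, whose payments really do differ from the stored total, remains.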
You can also round both values and cast them to money before comparing them:
cast(round(sumpaidamt,2) as money) <> cast(round(Bpaidamt,2) as money)
SQL Fiddle showing how it would work: http://sqlfiddle.com/#!3/4eb79/1