How to combine multiple rows with JSON PATH in SQL Server? - sql

I have this dataset with 4 tables. I am trying to write the SQL query as follows:
WITH test AS
(
SELECT
(f.name), f.id, f.domain, s.link,
(SELECT
name,
CASE
WHEN name IN (1, 3, 8) THEN 1
WHEN name IN (2, 6, 7) THEN 2
END AS [group]
FROM tags
WHERE corporate_statement_link_id = s.id
FOR JSON PATH) AS tags
FROM
fortune1000_companies f
LEFT JOIN
search_results s ON f.id = s.company_id
LEFT JOIN
corporate_statements c ON s.id = c.corporate_statement_link_id
WHERE
c.corporate_statement = 1
AND s.domain LIKE CONCAT('%', f.domain, '%')
)
SELECT name, link, tags
FROM test
but this produces a result in which company names are duplicated because the links differ. For example, UnitedHealth Group (rows 4 & 5) appears in two rows because the link is different. I want the result to show each company name just once, with its tags grouped together. I don't need the link to be shown; it is only included for this question.

I think I figured it out.
This is what I did, and it gave me the answer I was looking for.
select name
, (STUFF((SELECT t.[name] from tags t
inner join search_results s
on s.id = t.corporate_statement_link_id
where f.id = s.company_id
FOR JSON PATH),1,2,'[{')) as ts
from [fortune1000_companies] f
where f.id between 1 and 101
I got the help from here
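For anyone who wants to try the one-row-per-company shape without a SQL Server instance, here is a minimal sketch. It is not the query above: it uses Python's sqlite3 module, the sample rows are made up, and the JSON array is built in Python instead of with FOR JSON PATH. Table and column names follow the question.

```python
import json
import sqlite3
from collections import defaultdict

def tags_per_company(conn):
    """Return {company name: JSON array of tag objects}, one entry per company."""
    rows = conn.execute(
        """
        SELECT f.name, t.name
        FROM fortune1000_companies f
        LEFT JOIN search_results s ON s.company_id = f.id
        LEFT JOIN tags t ON t.corporate_statement_link_id = s.id
        ORDER BY f.id, t.name
        """
    )
    grouped = defaultdict(list)
    for company, tag in rows:
        tags = grouped[company]  # creates the entry even when there are no tags
        if tag is not None:
            tags.append({"name": tag})
    return {company: json.dumps(tags) for company, tags in grouped.items()}

# Hypothetical sample data: one company with two links, one with none.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE fortune1000_companies (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE search_results (id INTEGER PRIMARY KEY, company_id INTEGER, link TEXT);
    CREATE TABLE tags (name INTEGER, corporate_statement_link_id INTEGER);
    INSERT INTO fortune1000_companies VALUES (1, 'UnitedHealth Group'), (2, 'Walmart');
    INSERT INTO search_results VALUES (10, 1, 'a.example'), (11, 1, 'b.example');
    INSERT INTO tags VALUES (1, 10), (3, 10), (2, 11);
    """
)
result = tags_per_company(conn)
```

Because the grouping key is the company and not the link, the two links of the first company collapse into a single entry holding all three tags.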

Related

Foreach loop on two SQL tables

I have two tables which are totally independent from each other, and I need to extract information from both of them and generate a CSV.
I'm doing this query:
SELECT NOM_FLUX, TYPE_CONTENU, DATE_DEPOT_GED
FROM FLUX_GED
WHERE TYPE_CONTENU = 'TEMPO_COURRIER_FSS'
AND NOM_FLUX NOT LIKE 'PCC%'
With this result:
Then I'm doing a query from this result with the ID
Like This (on the first result)
SELECT ID, URL_RELATIVE, TYPE_CONTENU, NOM_ELEMENT
FROM ELEMENT_GED
WHERE ID IN (
SELECT ID_ELEMENT
FROM SUIVI_GED
WHERE ID_FLUX IN (18682403)
)
With this result:
And here is the information from the SUIVI_GED table:
First I would like to do something like a PowerShell foreach loop over every ID from my first query, and then export the results of both queries to a common CSV.
I would like a result like this in my CSV:
NOM_FLUX;URL_RELATIVE;TYPE_CONTENU;NOM_ELEMENT
infoNomFlux;infoURL;infoType;infoNOM
You seem to want a join. A rather literal translation of your queries would be:
select
f.nom_flux,
f.type_contenu as type_contenu_flux,
f.date_depot_ged ,
e.id,
e.url_relative,
e.type_contenu as type_contenu_element,
e.nom_element
from flux_ged f
inner join element_ged e
on exists (select 1 from suivi_ged s where s.id_flux = f.id and s.id_element = e.id)
where
f.type_contenu = 'TEMPO_COURRIER_FSS'
and f.nom_flux not like 'PCC%'
Depending on your actual design, you might be able to flatten the exists condition as another join:
select
f.nom_flux,
f.type_contenu as type_contenu_flux,
f.date_depot_ged ,
e.id,
e.url_relative,
e.type_contenu as type_contenu_element,
e.nom_element
from flux_ged f
inner join suivi_ged s on s.id_flux = f.id
inner join element_ged e on e.id = s.id_element
where
f.type_contenu = 'TEMPO_COURRIER_FSS'
and f.nom_flux not like 'PCC%'
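If the end goal is the semicolon-separated CSV from the question, the join can feed it directly, with no per-ID loop. Below is a rough sketch using Python's sqlite3 and csv modules. It assumes FLUX_GED has an ID column and that SUIVI_GED links ID_FLUX to ID_ELEMENT, as the question's subquery suggests; the sample rows are invented.

```python
import csv
import io
import sqlite3

QUERY = """
SELECT f.nom_flux, e.url_relative, e.type_contenu, e.nom_element
FROM flux_ged f
JOIN suivi_ged s ON s.id_flux = f.id
JOIN element_ged e ON e.id = s.id_element
WHERE f.type_contenu = 'TEMPO_COURRIER_FSS'
  AND f.nom_flux NOT LIKE 'PCC%'
ORDER BY f.id, e.id
"""

def export_csv(conn):
    """Run the join and return the rows as semicolon-separated CSV text."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter=";", lineterminator="\n")
    writer.writerow(["NOM_FLUX", "URL_RELATIVE", "TYPE_CONTENU", "NOM_ELEMENT"])
    writer.writerows(conn.execute(QUERY))
    return out.getvalue()

# Hypothetical sample data: the second flux is filtered out by the PCC% rule.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE flux_ged (id INTEGER, nom_flux TEXT, type_contenu TEXT);
    CREATE TABLE element_ged (id INTEGER, url_relative TEXT, type_contenu TEXT, nom_element TEXT);
    CREATE TABLE suivi_ged (id_flux INTEGER, id_element INTEGER);
    INSERT INTO flux_ged VALUES (1, 'FLUX_A', 'TEMPO_COURRIER_FSS'), (2, 'PCC_B', 'TEMPO_COURRIER_FSS');
    INSERT INTO element_ged VALUES (100, '/docs/a.pdf', 'PDF', 'a.pdf'), (200, '/docs/b.pdf', 'PDF', 'b.pdf');
    INSERT INTO suivi_ged VALUES (1, 100), (2, 200);
    """
)
csv_text = export_csv(conn)
```

Against a real server you would swap the sqlite3 connection for your own driver; the query and the CSV writing stay the same.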

SQL query to retrieve last record from a linked table [duplicate]

I wrote a query to compare 2 columns in different tables (TRELAY VS TUSERDEF8). The query works great, except that it retrieves the top record in the TUSERDEF8 table which has a many to one relationship to the TRELAY table.
The tables are linked by TRELAY.ID = TUSERDEF8.N01. I would like to retrieve the latest record from TUSERDEF8 and compare that record with the TRELAY record. I plan to use the max value of the index column (TUSERDEF8.ID) to determine the latest record.
I am using SQL Server.
My code is below, but I'm not sure how to change the query to retrieve the last TUSERDEF8 record. Any help is appreciated.
SELECT
TRELAY.ID, TRELAY.S15,
TUSERDEF8.S04, TUSERDEF8.N01, TUSERDEF8.S06
FROM
TRELAY
INNER JOIN
TUSERDEF8 ON TRELAY.ID = TUSERDEF8.N01
WHERE
LEFT(TRELAY.S15, 1) <> LEFT(TUSERDEF8.S04, 1)
AND NOT (TRELAY.S15 LIKE '%MEDIUM%' AND
TUSERDEF8.S04 LIKE '%N/A%' AND
TUSERDEF8.S06 LIKE '%EACMS%')
Assuming that your IDs are integers, the query below might work:
SELECT TOP 1 TRELAY.ID, TRELAY.S15, TUSERDEF8.S04, TUSERDEF8.N01, TUSERDEF8.S06
FROM TRELAY INNER JOIN TUSERDEF8
ON TRELAY.ID = TUSERDEF8.N01
WHERE LEFT(TRELAY.S15, 1) <> LEFT(TUSERDEF8.S04, 1)
AND NOT (
TRELAY.S15 LIKE '%MEDIUM%'
AND TUSERDEF8.S04 LIKE '%N/A%'
AND TUSERDEF8.S06 LIKE '%EACMS%'
)
ORDER BY TUSERDEF8.ID DESC
HTH
Dave
You could do this:
With cteLastRecord As
(
Select S04, N01, S06,
Row_Number() Over (Partition By N01 Order By ID Desc) SortOrder
From TUSERDEF8
)
SELECT
TRELAY.ID, TRELAY.S15,
TUSERDEF8.S04, TUSERDEF8.N01, TUSERDEF8.S06
FROM
TRELAY
INNER JOIN
(Select S04, N01, S06 From cteLastRecord Where SortOrder = 1) TUSERDEF8 ON TRELAY.ID = TUSERDEF8.N01
WHERE
LEFT(TRELAY.S15, 1) <> LEFT(TUSERDEF8.S04, 1)
AND NOT (TRELAY.S15 LIKE '%MEDIUM%' AND
TUSERDEF8.S04 LIKE '%N/A%' AND
TUSERDEF8.S06 LIKE '%EACMS%')
I believe that your expected output is still a little ambiguous.
It sounds to me like you want only the record from the output where TUSERDEF8.ID is at its max. If that's correct, then try this:
SELECT TRELAY.ID, TRELAY.S15, TUSERDEF8.S04, TUSERDEF8.N01, TUSERDEF8.S06
FROM TRELAY
INNER JOIN TUSERDEF8 ON TRELAY.ID = TUSERDEF8.N01
WHERE LEFT(TRELAY.S15, 1) <> LEFT(TUSERDEF8.S04, 1)
AND NOT (TRELAY.S15 LIKE '%MEDIUM%' AND
TUSERDEF8.S04 LIKE '%N/A%' AND
TUSERDEF8.S06 LIKE '%EACMS%')
AND TUSERDEF8.ID IN (SELECT MAX(TUSERDEF8.ID) FROM TUSERDEF8)
EDIT: After reviewing your recent comments, it would seem something like this would be more suitable:
SELECT
C.ID
, C.S15
, D.S04
, D.N01
, D.S06
FROM (
SELECT A.ID, A.S15, MAX(B.ID) AS MaxID
FROM TRELAY AS A
INNER JOIN TUSERDEF8 AS B ON A.ID = B.N01
WHERE
LEFT(A.S15, 1) <> LEFT(B.S04, 1)
AND NOT (A.S15 LIKE '%MEDIUM%' AND
B.S04 LIKE '%N/A%' AND
B.S06 LIKE '%EACMS%')
GROUP BY A.ID, A.S15
) AS C
INNER JOIN TUSERDEF8 AS D ON C.ID = D.N01 AND C.MaxID = D.ID
Using an ID column to determine which row is "last" is a bad idea
Using cryptic table names like "TUSERDEF8" (how is it different from TUSERDEF7) is a very bad idea, along with completely cryptic column names like "S04".
Using prefixes like "T" for table is a bad idea - it should already be clear that it's a table.
Now that all of that is out of the way:
SELECT
R.ID,
R.S15,
U.S04,
U.N01,
U.S06
FROM
TRELAY R
INNER JOIN TUSERDEF8 U ON U.N01 = R.ID
LEFT OUTER JOIN TUSERDEF8 U2 ON
U2.N01 = R.ID AND
U2.ID > U.ID
WHERE
U2.ID IS NULL AND -- This will only happen if the LEFT OUTER JOIN above found no match, meaning that the row in U has the highest ID value of all matches
LEFT(R.S15, 1) <> LEFT(U.S04, 1) AND
NOT (
R.S15 LIKE '%MEDIUM%' AND
U.S04 LIKE '%N/A%' AND
U.S06 LIKE '%EACMS%'
)
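To see why the "latest row" matters, here is a small sketch of the self-join (anti-join) approach run against made-up data with Python's sqlite3 module. It is simplified: the NOT (...) exclusion is left out, and substr() stands in for LEFT(), which SQLite does not have. The sample values are hypothetical.

```python
import sqlite3

# Keep only the TUSERDEF8 row with the highest ID per relay, then compare.
LATEST_MISMATCH = """
SELECT r.id, r.s15, u.id AS userdef_id, u.s04
FROM trelay r
JOIN tuserdef8 u ON u.n01 = r.id
LEFT JOIN tuserdef8 u2 ON u2.n01 = r.id AND u2.id > u.id
WHERE u2.id IS NULL                              -- no newer row exists
  AND substr(r.s15, 1, 1) <> substr(u.s04, 1, 1)
ORDER BY r.id
"""

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE trelay (id INTEGER, s15 TEXT);
    CREATE TABLE tuserdef8 (id INTEGER, n01 INTEGER, s04 TEXT);
    INSERT INTO trelay VALUES (1, 'HIGH'), (2, 'LOW');
    -- Relay 1: the old row (10) mismatches, the latest row (11) matches.
    -- Relay 2: the latest row (13) mismatches.
    INSERT INTO tuserdef8 VALUES (10, 1, 'MEDIUM'), (11, 1, 'HIGH'),
                                 (12, 2, 'LOW'), (13, 2, 'MEDIUM');
    """
)
mismatches = conn.execute(LATEST_MISMATCH).fetchall()
```

Only relay 2 is reported. Relay 1 has an older mismatching row, but it is ignored because a newer row exists for the same relay.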

SQL - Select records not present in another table (3 table relation)

I have 3 tables:
Table_Cars
-id_car
-description
Table_CarDocuments
-id_car
-id_documentType
-path_to_document
Table_DocumentTypes
-id_documentType
-description
I want to select all cars that do NOT have documents on the table Table_CarDocuments with 4 specific id_documentType.
Something like this:
Car1 | TaxDocument
Car1 | KeyDocument
Car2 | TaxDocument
With this I know that I'm missing 2 documents of car1 and 1 document of car2.
You are looking for missing car documents. So cross join cars and document types and look for combinations NOT IN the car documents table.
select c.description as car, dt.description as doctype
from table_cars c
cross join table_documenttypes dt
where (c.id_car, dt.id_documenttype) not in
(
select cd.id_car, cd.id_documenttype
from table_cardocuments cd
);
UPDATE: It turns out that SQL Server's IN clause does not support row value lists such as (a, b) IN (...). But a NOT IN clause can easily be replaced by NOT EXISTS:
select c.description as car, dt.description as doctype
from table_cars c
cross join table_documenttypes dt
where not exists
(
select *
from table_cardocuments cd
where cd.id_car = c.id_car
and cd.id_documenttype = dt.id_documenttype
);
UPDATE: As you are only interested in particular id_documenttype (for which you'd have to add and dt.id_documenttype in (1, 2, 3, 4) to the query), you can generate records for them on-the-fly instead of having to read the table_documenttypes.
In order to do that replace
cross join table_documenttypes dt
with
cross join (values (1), (2), (3), (4)) as dt(id_documentType)
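A quick way to check the cross join plus NOT EXISTS idea is to run it on a few made-up rows. The sketch below uses Python's sqlite3 module, so it filters table_documenttypes with IN instead of using the VALUES constructor (SQLite does not accept the dt(id_documentType) column alias list). The data is hypothetical and chosen to reproduce the example output from the question.

```python
import sqlite3

MISSING_DOCS = """
SELECT c.description, dt.description
FROM table_cars c
CROSS JOIN table_documenttypes dt
WHERE dt.id_documentType IN (1, 2, 3, 4)
  AND NOT EXISTS (
        SELECT 1
        FROM table_cardocuments cd
        WHERE cd.id_car = c.id_car
          AND cd.id_documentType = dt.id_documentType)
ORDER BY c.id_car, dt.id_documentType
"""

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE table_cars (id_car INTEGER, description TEXT);
    CREATE TABLE table_documenttypes (id_documentType INTEGER, description TEXT);
    CREATE TABLE table_cardocuments (id_car INTEGER, id_documentType INTEGER, path_to_document TEXT);
    INSERT INTO table_cars VALUES (1, 'Car1'), (2, 'Car2');
    INSERT INTO table_documenttypes VALUES
        (1, 'TaxDocument'), (2, 'KeyDocument'), (3, 'InsuranceDocument'),
        (4, 'InspectionDocument'), (5, 'OtherDocument');
    -- Car1 lacks types 1 and 2; Car2 lacks type 1. Type 5 is not required.
    INSERT INTO table_cardocuments VALUES
        (1, 3, '/c1/ins'), (1, 4, '/c1/insp'), (1, 5, '/c1/other'),
        (2, 2, '/c2/key'), (2, 3, '/c2/ins'), (2, 4, '/c2/insp');
    """
)
missing = conn.execute(MISSING_DOCS).fetchall()
```

Each returned pair is a car together with one required document type it does not have.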
You can use the query below to get the result:
SELECT
c.description,
dt.description
FROM
Table_Cars c
JOIN Table_CarDocuments cd ON c.id_car = cd.id_car
JOIN Table_DocumentTypes dt ON cd.id_documentType = dt.id_documentType
WHERE
dt.id_documentType NOT IN (1, 2, 3, 4) --replace with your document type id
Thanks to @Thorsten Kettner's help:
select c.description as car, dt.description as doctype
from table_cars c
cross join table_documenttypes dt
where dt.id_documentType not in
(
select cd.id_documentType
from table_cardocuments cd
where cd.id_car = c.id_car and cd.id_documentType = dt.id_documentType
)
AND dt.id_documentType IN (1, 2, 3, 4)
This can be a complicated query. The idea is to generate all combinations of cars and the four documents that you want (using cross join). Then use left join to determine if the document actually exists:
select c.id_car, dd.doctype
from cars c cross join
(select 'doc1' as doctype union all
select 'doc2' union all
select 'doc3' union all
select 'doc4'
) dd left join
(select cd.id_car, d.doctype
from CarDocuments cd join
Documents d
on cd.id_document_type = d.id_document_type
) x
on x.id_car = c.id_car and x.doctype = dd.doctype
where x.id_car is null;
Finally, the where clause finds the car/doctype pairs that are not present in the data.

SQL union query to group by a field and project into flattened result set

I am attempting to write a query which will retrieve records from the same table with different conditions and present the results in a flattened format as below:
Desired Result
ID, Word, Translation_From_Region_1007, Translation_From_Region_1006
1, Word1, Test 1, Test 2
2, Word2, Test 3, Test 4
The pseudo-code for my query is below, however I'm not entirely sure how to flatten out the results to display my desired result:
SELECT Words.ID, Words.Word, Translation
FROM Words WHERE RegionId=1007
UNION
SELECT Words.ID, Words.Word, Translation
FROM Words WHERE RegionId=1006
Group by Word (as I only want one instance of the word itself with its respective translations flattened).
If anybody can give me any advice or suggest a better way to do this, I'd be very grateful.
How about this?
select word, max(Translation1006), max(Translation1007)
from
(SELECT
words.word,
Translation1006 =
CASE region
when 1006 THEN trans
else NULL
END,
Translation1007 =
CASE region
when 1007 THEN trans
else NULL
END
FROM
words) as detail
group by word
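The same conditional-aggregation idea can be tried on the question's sample values. This is only a sketch: it uses Python's sqlite3 module and assumes the Words table has the columns ID, Word, RegionId and Translation named in the question.

```python
import sqlite3

# One row per word; each region's translation lands in its own column.
FLATTEN = """
SELECT ID, Word,
       MAX(CASE WHEN RegionId = 1007 THEN Translation END) AS Translation_1007,
       MAX(CASE WHEN RegionId = 1006 THEN Translation END) AS Translation_1006
FROM Words
WHERE RegionId IN (1006, 1007)
GROUP BY ID, Word
ORDER BY ID
"""

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Words (ID INTEGER, Word TEXT, RegionId INTEGER, Translation TEXT);
    INSERT INTO Words VALUES
        (1, 'Word1', 1007, 'Test 1'), (1, 'Word1', 1006, 'Test 2'),
        (2, 'Word2', 1007, 'Test 3'), (2, 'Word2', 1006, 'Test 4');
    """
)
flattened = conn.execute(FLATTEN).fetchall()
```

MAX ignores the NULLs produced by the non-matching CASE branch, so each group keeps the single translation for that region.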
How about something like this?
select A.ID, A.Word, Trans1007 = A.Translation, Trans1006 = B.Translation
from WORDS A
left outer join WORDS B on A.ID = B.ID and B.RegionId = 1006
where A.RegionId = 1007
union
select B.ID, B.Word, Trans1007 = A.Translation, Trans1006 = B.Translation
from WORDS B
left outer join WORDS A on B.ID = A.ID and A.RegionId = 1007
where B.RegionId = 1006
or you can pivot similar to this (which will be better if you have more than just two regions you would like to query on)...
select ID, Word, [1006] as T_1006, [1007] as T_1007
from (select Id, Word, RegionId, Translation from WORDS where RegionId in (1006, 1007)) w
pivot (
max(Translation)
for RegionId in([1006], [1007])
) as pvt
If you are using SQL Server, you could in effect flatten the translation putting a comma between each translation entry like so.
SELECT Main.ID, Main.Word, Main.Translations
FROM(SELECT distinct Words2.ID, Words2.Word,
(SELECT Words1.Translation + ',' AS [text()]
FROM WORDS Words1
WHERE Words1.ID = Words2.ID
AND Words1.RegionId in (1007, 1006)
ORDER BY Words1.ID
For XML PATH ('')) [Translations]
FROM WORDS Words2) [Main]
Another simple example of this can be found in this Stack Overflow question:
Concatenate many rows into a single text string?
Alternatively, you can find numerous examples of doing this in Oracle here:
http://www.dba-oracle.com/t_display_multiple_column_values_same_rows.htm
SELECT distinct Words.ID, Words.Word, Translation
FROM Words
WHERE RegionId=1007 or RegionId=1006

MySQL scoping problem with correlated subqueries

I have this MySQL query, and it works:
SELECT
nom
,prenom
,(SELECT GROUP_CONCAT(category_en) FROM
(SELECT DISTINCT category_en FROM categories c WHERE id IN
(SELECT DISTINCT category_id FROM m3allems_to_categories m2c WHERE m3allem_id = 37)
) cS
) categories
,(SELECT GROUP_CONCAT(area_en) FROM
(SELECT DISTINCT area_en FROM areas c WHERE id IN
(SELECT DISTINCT area_id FROM m3allems_to_areas m2a WHERE m3allem_id = 37)
) aSq
) areas
FROM m3allems m
WHERE m.id = 37
The result is:
nom prenom categories areas
Man Multi Carpentry,Paint,Walls Beirut,Baalbak,Saida
It works correctly, but only when I hardcode into the query the id that I want (37).
I want it to work for all entries in the m3allems table, so I tried this:
SELECT
nom
,prenom
,(SELECT GROUP_CONCAT(category_en) FROM
(SELECT DISTINCT category_en FROM categories c WHERE id IN
(SELECT DISTINCT category_id FROM m3allems_to_categories m2c WHERE m3allem_id = m.id)
) cS
) categories
,(SELECT GROUP_CONCAT(area_en) FROM
(SELECT DISTINCT area_en FROM areas c WHERE id IN
(SELECT DISTINCT area_id FROM m3allems_to_areas m2a WHERE m3allem_id = m.id)
) aSq
) areas
FROM m3allems m
And I get an error:
Unknown column 'm.id' in 'where clause'
Why?
From the MySQL manual:
13.2.8.7. Correlated Subqueries
[...]
Scoping rule: MySQL evaluates from inside to outside.
So... does this not work when the subquery is in a SELECT section? I did not read anything about that.
Does anyone know? What should I do? It took me a long time to build this query... I know it's a monster query but it gets what I want in a single query, and I am so close to getting it to work!
Can anyone help?
You can only correlate one level deep.
Use:
SELECT m.nom,
m.prenom,
x.categories,
y.areas
FROM m3allems m
LEFT JOIN (SELECT m2c.m3allem_id,
GROUP_CONCAT(DISTINCT c.category_en) AS categories
FROM CATEGORIES c
JOIN m3allems_to_categories m2c ON m2c.category_id = c.id
GROUP BY m2c.m3allem_id) x ON x.m3allem_id = m.id
LEFT JOIN (SELECT m2a.m3allem_id,
GROUP_CONCAT(DISTINCT a.area_en) AS areas
FROM AREAS a
JOIN m3allems_to_areas m2a ON m2a.area_id = a.id
GROUP BY m2a.m3allem_id) y ON y.m3allem_id = m.id
WHERE m.id = ?
The reason for the error is that in the subquery m is not defined. It is defined later in the outer query.
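A way to avoid the scoping issue altogether is to aggregate in derived tables and join them, so nothing has to be correlated across nested levels. The sketch below runs that shape for every row of m3allems. It uses Python's sqlite3 module, not MySQL, and the sample rows are invented; the table and column names follow the question.

```python
import sqlite3

ALL_M3ALLEMS = """
SELECT m.nom, m.prenom, x.categories, y.areas
FROM m3allems m
LEFT JOIN (SELECT m2c.m3allem_id,
                  GROUP_CONCAT(DISTINCT c.category_en) AS categories
           FROM categories c
           JOIN m3allems_to_categories m2c ON m2c.category_id = c.id
           GROUP BY m2c.m3allem_id) x ON x.m3allem_id = m.id
LEFT JOIN (SELECT m2a.m3allem_id,
                  GROUP_CONCAT(DISTINCT a.area_en) AS areas
           FROM areas a
           JOIN m3allems_to_areas m2a ON m2a.area_id = a.id
           GROUP BY m2a.m3allem_id) y ON y.m3allem_id = m.id
ORDER BY m.id
"""

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE m3allems (id INTEGER, nom TEXT, prenom TEXT);
    CREATE TABLE categories (id INTEGER, category_en TEXT);
    CREATE TABLE areas (id INTEGER, area_en TEXT);
    CREATE TABLE m3allems_to_categories (m3allem_id INTEGER, category_id INTEGER);
    CREATE TABLE m3allems_to_areas (m3allem_id INTEGER, area_id INTEGER);
    INSERT INTO m3allems VALUES (37, 'Man', 'Multi'), (38, 'Doe', 'Jane');
    INSERT INTO categories VALUES (1, 'Carpentry'), (2, 'Paint');
    INSERT INTO areas VALUES (1, 'Beirut'), (2, 'Saida');
    -- The duplicate (37, 1) link is collapsed by DISTINCT.
    INSERT INTO m3allems_to_categories VALUES (37, 1), (37, 2), (37, 1);
    INSERT INTO m3allems_to_areas VALUES (37, 1), (38, 2);
    """
)
rows = conn.execute(ALL_M3ALLEMS).fetchall()
```

Rows without any category or area still appear, with NULL in that column, because both derived tables are attached with LEFT JOIN. The order of names inside each concatenated string is not guaranteed.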