I have a SQL query problem with the following abstract sample context: there are two different inputs for my SQL query, each defined as a "MainElement", with key 123 for one and 789 for the other.
I also have a table called Relation with the columns pk, 1stElement, 2ndElement and 3rdElement.
Furthermore, there is a table called Props with the columns pk, name and valueString. The special feature of this context is that the column name in Props defines two further columns of table Relation, called 4thElement and 5thElement, stored as rows with their values in the column valueString.
Relation:
| pk  | 1stElement | 2ndElement | 3rdElement |
| abc | 123        | 456        | null       |
| def | 789        | 101112     | 131415     |
Props:
| pk  | name       | valueString |
| def | 4thElement | 161718      |
| def | 5thElement | ghi         |
As you can see, MainElement 789 has a value for 4thElement and 5thElement in Props, but MainElement 123 has no values in Props.
What I need is a universal SQL query that takes a 1stElement value as input (e.g., 123 or 789) and returns a result for both main elements, regardless of the fact that MainElement 123 has no values in Props.
Sample result:
| 1stElement | 2ndElement | 3rdElement | 4thElement | 5thElement |
| 123        | 456        | null       | null       | null       |
| 789        | 101112     | 131415     | 161718     | ghi        |
I am using Oracle SQL Developer.
Select
rel.1stElement,
....
From
Relation rel,
Props pro,
Where
?
Thanks in advance.
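This should do the job. What you need is a typical pivot query: a LEFT OUTER JOIN keeps the rows without any Props entries, and MAX(CASE ...) turns the Props rows into columns: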
SELECT rel.pk, rel.1stElement, rel.2ndElement, rel.3rdElement
, MAX(CASE WHEN pro.Name = '4thElement'
THEN pro.ValueString
ELSE NULL
END) as 4thElement
, MAX(CASE WHEN pro.Name = '5thElement'
THEN pro.ValueString
ELSE NULL
END) as 5thElement
FROM Relation rel
LEFT OUTER JOIN Props pro
ON rel.pk = pro.pk
GROUP BY rel.pk, rel.1stElement, rel.2ndElement, rel.3rdElement
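To check the behaviour for a main element without Props rows, here is a minimal sketch that runs the same pivot against an in-memory SQLite database from Python. SQLite only stands in for Oracle here, and the column names first_element, second_element and so on are assumptions made for the sketch, since unquoted identifiers cannot start with a digit. The WHERE clause on the input key is also an addition that the query above does not have.

```python
import sqlite3

# In-memory SQLite stand-in for the Oracle tables (assumption for this sketch).
# Columns are renamed because unquoted identifiers cannot start with a digit.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE relation (pk TEXT, first_element INTEGER,
                           second_element INTEGER, third_element INTEGER);
    CREATE TABLE props (pk TEXT, name TEXT, value_string TEXT);
    INSERT INTO relation VALUES ('abc', 123, 456, NULL),
                                ('def', 789, 101112, 131415);
    INSERT INTO props VALUES ('def', '4thElement', '161718'),
                             ('def', '5thElement', 'ghi');
""")

# Same pivot as above, plus a filter on the input key.
PIVOT_SQL = """
    SELECT rel.first_element, rel.second_element, rel.third_element,
           MAX(CASE WHEN pro.name = '4thElement' THEN pro.value_string END) AS fourth_element,
           MAX(CASE WHEN pro.name = '5thElement' THEN pro.value_string END) AS fifth_element
    FROM relation rel
    LEFT OUTER JOIN props pro ON rel.pk = pro.pk
    WHERE rel.first_element = ?
    GROUP BY rel.pk, rel.first_element, rel.second_element, rel.third_element
"""

def lookup(first_element):
    """Return the pivoted rows for one MainElement key."""
    return conn.execute(PIVOT_SQL, (first_element,)).fetchall()

rows_123 = lookup(123)  # no Props rows: the pivoted columns come back as NULL
rows_789 = lookup(789)  # both Props rows are folded into one result row
```

Because the join is a LEFT OUTER JOIN, key 123 still produces one row, with None in the two pivoted columns.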
I've got an issue when trying to visualize some information from a denormalized table in Google Data Studio.
Context: I want to gather all the contacts of a company and their related orders in a table in BigQuery. Contacts can have no orders or multiple orders. Following BigQuery best practice, this table is denormalized and all the orders for a client are stored in an array of structs. It looks like this:
Fields Examples:
+-------+------------+-------------+-----------+
| Row # | Contact_Id | Orders.date | Orders.id |
+-------+------------+-------------+-----------+
|- 1 | 23 | 2019-02-05 | CB1 |
| | | 2020-03-02 | CB293 |
|- 2 | 2321 | - | - |
|- 3 | 77 | 2010-09-03 | AX3 |
+-------+------------+-------------+-----------+
The issue is when I want to use this table as a data source in Data Studio.
For instance, if I build a table with Contact_Id as a dimension, everything is fine and I can see all my contacts. However, if I add any dimension from the Orders struct, contacts with no orders are no longer displayed. For instance, all info for Contact_Id 2321 is removed from the table.
Have you found any workaround to visualize these empty arrays (for instance as null values)?
The only solution I've found is to build an intermediary table with the orders unnested.
The way I've just discovered to work around this is to add an extra field in my DS-> BQ connector:
ARRAY_LENGTH(fields.orders) AS numberoforders
This will return zero if the array is empty - you can then create calculated fields within DataStudio - using the "numberoforders" field to force values to NULL or zero.
You can fix this behaviour by slightly changing the query in your BigQuery connector.
Instead of doing this:
SELECT
Contact_id,
Orders
FROM myproject.mydataset.mytable
try this:
SELECT
Contact_id,
IF(ARRAY_LENGTH(Orders) > 0, Orders, [STRUCT(CAST(NULL AS DATE) AS date, CAST(NULL AS STRING) AS id)]) AS Orders
FROM myproject.mydataset.mytable
This way you force your repeated field to contain at least one element (a struct of NULL values), and hence Data Studio will display those missing values.
Also, if you want to create new calculated fields using one of the nested fields, you should first check whether the value is NULL, so that the NULL placeholders are not filled with a computed value. For example, if you have a repeated and nested field which can be 1 or 0, and you want to create a calculated field swapping the value, you should do:
IF(myfield.key IS NOT NULL, IF(myfield.key = 1, 0, 1), NULL)
Here you can see what happens if you check before swapping and if you don't:
Original value | No check | Check
1              | 0        | 0
0              | 1        | 1
NULL           | 1        | NULL
1              | 0        | 0
NULL           | 1        | NULL
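The comparison above can be reproduced with a small sketch. It uses SQLite CASE expressions from Python rather than Data Studio's IF function, on the assumption that a comparison with NULL is "not true" in both, which is what the table above suggests. The table name t and the column names pos and k are made up for the sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (pos INTEGER, k INTEGER)")
# Same five values as in the comparison; pos only preserves the row order.
conn.executemany("INSERT INTO t VALUES (?, ?)",
                 list(enumerate([1, 0, None, 1, None])))

swapped = conn.execute("""
    SELECT k,
           -- No check: NULL = 1 is not true, so NULL falls into the ELSE branch.
           CASE WHEN k = 1 THEN 0 ELSE 1 END AS no_check,
           -- Check first: the swap only happens for non-NULL values.
           CASE WHEN k IS NOT NULL
                THEN CASE WHEN k = 1 THEN 0 ELSE 1 END
           END AS with_check
    FROM t
    ORDER BY pos
""").fetchall()
```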
I want to import data from one SQL database into another. The database containing the data is structured differently from the one I have now.
My database has the tables Person and Person_Data
Person columns:
id(PK, int) | Person_Name(text)| Person_Data_id(FK, int)
Person_Data columns:
Person_Data_id(PK, int) | Date_Of_Birth(text)| City_Of_Birth(text) | Favorite_City(text)|
The other database has the necessary data to populate this, but is structured a bit differently. It has these tables:
ExternalPerson, ExternalProperty
ExternalPerson columns:
|PersonID(PK, int) | Name(string) |
| 0 |"John" |
| 1 |"Bob" |
ExternalProperty columns:
|PersonId|PropertyName|PropertyAttribute|PropertyValue|
| 0 |"Birth" | "City" |"Rome" |
| 1 |"Birth" | "City" |"Vienna" |
| 0 |"Birth" | "Date" |"1982-02-01" |
| 0 |"Favorite" | "City" |"New York" |
As you can see, the external database contains information that could be inserted into the regular one. It's just that some of the columns are stored as rows instead. I want to merge it so that, for each PersonID, we pick up the PropertyValue for Birth and City and put it in City_Of_Birth, and so on. The external database is structured so that each combination of PersonID, PropertyName and PropertyAttribute has only one row, so there is no risk of ambiguity. All combinations of PropertyName and PropertyAttribute present in the external database also have a corresponding column in the Person_Data table. There might be missing data though: in our case, Bob does not have a value for date of birth or favorite city, in which case those entries should be null. That is, I want to transform the two tables ExternalPerson and ExternalProperty into
|id(PK, int)|Name |Date_Of_Birth|City_Of_Birth|Favorite_City|
|auto |"John"|"1982-02-01" |"Rome" |"New York" |
|increment |"Bob" | NULL |"Vienna" |NULL |
I have tried various combinations of JOIN, GROUP BY, SELECT CASE WHEN and COALESCE to no avail. I feel like this should be possible, but I have not succeeded in finding the SQL commands to turn the rows from the external database into columns. For example, the query
SELECT
Name,
PropertyValue AS City_Of_Birth
FROM
ExternalProperty
WHERE PropertyName LIKE 'Birth' AND PropertyAttribute LIKE 'City'
will output the City_Of_Birth in a single column together with Name, but I don't know how to aggregate the result.
Does anybody have any idea on how to do this? Thanks in advance.
I am using Microsoft SQL Server Management Studio 2017 and Microsoft SQL Server 2017 (RTM) - 14.0.1000.169 (X64)
You can aggregate with MAX() while grouping by person, using one expression like this per target column:
MAX(CASE WHEN PropertyName LIKE 'Birth' AND PropertyAttribute LIKE 'City' THEN PropertyValue ELSE NULL END) AS City_Of_Birth
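As a rough sketch of how the MAX(CASE ...) expression fits into a complete query, the following runs the pivot on the sample rows from the question. It uses an in-memory SQLite database from Python instead of SQL Server, and the LEFT JOIN plus the GROUP BY on PersonID and Name are assumptions about how the pieces would be combined. It uses = where the answer uses LIKE, which is equivalent here because the patterns contain no wildcards.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ExternalPerson (PersonID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE ExternalProperty (PersonId INTEGER, PropertyName TEXT,
                                   PropertyAttribute TEXT, PropertyValue TEXT);
    INSERT INTO ExternalPerson VALUES (0, 'John'), (1, 'Bob');
    INSERT INTO ExternalProperty VALUES
        (0, 'Birth', 'City', 'Rome'),
        (1, 'Birth', 'City', 'Vienna'),
        (0, 'Birth', 'Date', '1982-02-01'),
        (0, 'Favorite', 'City', 'New York');
""")

# One MAX(CASE ...) per target column; the LEFT JOIN keeps persons
# that have no property rows at all.
pivoted = conn.execute("""
    SELECT p.Name,
           MAX(CASE WHEN x.PropertyName = 'Birth' AND x.PropertyAttribute = 'Date'
                    THEN x.PropertyValue END) AS Date_Of_Birth,
           MAX(CASE WHEN x.PropertyName = 'Birth' AND x.PropertyAttribute = 'City'
                    THEN x.PropertyValue END) AS City_Of_Birth,
           MAX(CASE WHEN x.PropertyName = 'Favorite' AND x.PropertyAttribute = 'City'
                    THEN x.PropertyValue END) AS Favorite_City
    FROM ExternalPerson p
    LEFT JOIN ExternalProperty x ON x.PersonId = p.PersonID
    GROUP BY p.PersonID, p.Name
    ORDER BY p.PersonID
""").fetchall()
```

Missing properties, such as Bob's date of birth, come back as None because MAX over a group with no matching rows yields NULL. The same SELECT could feed an INSERT INTO ... SELECT on the target side.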
I'm working on a filter where the user can choose different conditions for the end output. Right now I'm doing the construction of the SQL query, but whenever more conditions are selected, it doesn't work.
Example of the advalues table.
+----+-----------+---------------+------------+
| id | listingId | value | identifier |
+----+-----------+---------------+------------+
| 1 | 1a | Alaskan Husky | race |
+----+-----------+---------------+------------+
| 2 | 1a | Højt | activity |
+----+-----------+---------------+------------+
| 3 | 1c | Akita | race |
+----+-----------+---------------+------------+
| 4 | 1c | Mellem | activity |
+----+-----------+---------------+------------+
As you can see, there's a different row for each advalue.
The outcome I expect
Let's say the user has checked/ticked the checkbox for the race where it says "Alaskan Husky", then it should return the listingId for the match (once). If the user has selected both "Alaskan Husky" and activity level to "Low" then it should return nothing, if the activity level is either "Mellem" or "Højt" (medium, high), then it should return the listingId for where the race is "Alaskan Husky" only, not "Akita". I hope you understand what I'm trying to accomplish.
I tried something like this, which returns nothing.
SELECT * FROM advalues WHERE (identifier="activity" AND value IN("Mellem","Højt")) AND (identifier="race" AND value IN("Alaskan Husky"))
By the way, I want to select distinct listingId as well, so it only returns unique listingId's.
I will continue to search around for solutions, which I've been doing for the past few hours, but wanted to post here too, since I haven't been able to find anything that helped me yet. Thanks!
You can split the restrictions on identifier across two instances of the table, one for each type. Then you join on listingId to obtain the listingIds that satisfy both types of identifier.
SELECT ad.listingId
FROM advalues ad
JOIN advalues ad2
ON ad.listingId = ad2.listingId
WHERE ( ad.identifier = 'activity' AND ad.value IN( 'Mellem', 'Højt' ) )
AND ( ad2.identifier = 'race' AND ad2.value IN( 'Alaskan Husky' ) )
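A quick way to try the self-join is the sketch below, which loads the sample rows into an in-memory SQLite database from Python. The helper name listings, the added DISTINCT and the activity value 'Low' (which does not occur in the sample data) are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE advalues (id INTEGER, listingId TEXT, value TEXT, identifier TEXT);
    INSERT INTO advalues VALUES
        (1, '1a', 'Alaskan Husky', 'race'),
        (2, '1a', 'Højt', 'activity'),
        (3, '1c', 'Akita', 'race'),
        (4, '1c', 'Mellem', 'activity');
""")

def listings(race, activities):
    """Listings whose race row AND activity row both match (self-join)."""
    marks = ", ".join("?" for _ in activities)  # one placeholder per activity value
    sql = f"""
        SELECT DISTINCT ad.listingId
        FROM advalues ad
        JOIN advalues ad2 ON ad.listingId = ad2.listingId
        WHERE ad.identifier = 'activity' AND ad.value IN ({marks})
          AND ad2.identifier = 'race' AND ad2.value = ?
        ORDER BY ad.listingId
    """
    return [row[0] for row in conn.execute(sql, (*activities, race))]

husky_active = listings("Alaskan Husky", ["Mellem", "Højt"])  # only listing 1a
husky_low = listings("Alaskan Husky", ["Low"])                # no listing matches
```

Each alias filters one identifier, so a listing is returned only when both conditions hold for it. Each additional filter type would need one more join of the same table.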
The question isn't exactly clear, but I think you want this:
WHERE (identifier="activity" AND value IN("Mellem","Højt")) OR (identifier="race" AND value IN("Alaskan Husky"))
If I got you right you are trying to fetch data with different "filters".
Your Query
SELECT listingId FROM advalues
WHERE identifier="activity"
AND value IN("Mellem","Højt")
AND identifier="race"
AND value IN("Alaskan Husky")
Will always return 0 results as you are asking for identifier = "activity" AND identifier = "race"
I think you wanted to do something like this instead:
SELECT listingId FROM advalues
WHERE
(identifier="activity" AND value IN("Mellem","Højt"))
OR
(identifier="race" AND value IN("Alaskan Husky"))
This query is a subset of a large query, where I'm OUTER APPLY'ing a bunch of values, to filter out results later
I've got some data:
Table: Items
ID | Material | Form
----------------------------------
1 | Aluminium | Sheets
------------------------------
1 | Carbon Steel | Bars
------------------------------
2 | Aluminium | Bars
I want to find the matching IDs, that satisfy a given input. The input can be in one of three forms, and can have one or many rows. When an input has multiple rows, the item must satisfy ALL rows. Examples of input are given below:
#Input type 1: (just a material, one or multiple allowed)
Material | Form
-------------------
Aluminium | NULL
#Input type 2: (material and a form, one or multiple allowed)
Material | Form
-------------------
Aluminium | Sheets
#Input type 3: (one or more material and form, with one or more materials)
Material | Form
-------------------
Aluminium | Sheets
Carbon Steel | NULL
I've written a query that can handle input type 1 and a query for input type 2, but I need to combine them, and be able to handle input type 3.
Query for Input Type 1:
Select *
From table
OUTER APPLY(
SELECT top(1) i.Material
FROM #Input i --Input type 1
WHERE i.Material NOT IN
(SELECT items.Material
FROM Items
WHERE items.id = table.id)
)MaterialCondition
--this makes sure that there isn't anything selected that does not match Material
WHERE MaterialCondition.Material IS NULL
Query for Input Type 2:
Select *
From table
OUTER APPLY(
SELECT top(1) i.Material, i.Form
FROM #Input i --Input type 2
WHERE NOT EXISTS
(SELECT *
FROM Items
WHERE items.id = table.id
AND items.Material = i.Material
AND items.Form = i.Form)
)MaterialCondition
--this makes sure that there isn't anything selected that does not match Material
WHERE MaterialCondition.Form IS NULL
Again, at this point, I need to be able to:
1) Combine the queries into the same OUTER APPLY block
2) Accommodate Input Type 3
Any help would be greatly appreciated! Also, if I can explain anything, or be any clearer about any aspect of this, please let me know. I tried to keep it as short and focused as possible.
EDIT
Here would be the desired output from the query
ID | Name | MaterialCondition.Material
-------------------------------------------
23 | Some Item | (any text, such as 'Carbon Steel') <-- This is not a match
12 | Other Item | NULL <-- This IS a match
--(the where clause will filter these out, by saying)
WHERE MaterialCondition.Material IS NULL
So just ID number 12 is returned:
ID | Name | MaterialCondition.Material
-------------------------------------------
12 | Other Item | NULL
So far I've gotten to a state that functions, like this:
Select *
From table
OUTER APPLY(
SELECT top(1) i.Material
FROM #Input i --Input type 1
WHERE i.Material NOT IN
(SELECT items.Material
FROM Items
WHERE items.id = table.id)
)MaterialCondition
OUTER APPLY(
SELECT top(1) i.Material, i.Form
FROM #Input i --Input type 2
WHERE NOT EXISTS
(SELECT *
FROM Items
WHERE items.id = table.id
AND items.Material = i.Material
AND items.Form = i.Form)
)MaterialCondition2
--this makes sure that there isn't anything selected that does not match Material
WHERE MaterialCondition.Material IS NULL AND MaterialCondition2.Form IS NULL
This works properly: I've got one OUTER APPLY for input type 1 and one for input type 2, and together they take care of their respective parts of input type 3. I guess I was just hoping to contain this logic inside a single OUTER APPLY.
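One possible way to fold both checks into a single condition is to treat a NULL Form in the input as a wildcard. The sketch below tries that idea on the sample data with an in-memory SQLite database from Python. SQLite has no OUTER APPLY, so it uses a double NOT EXISTS instead, and it queries Items directly rather than the larger outer table. The table name input and the helper matching_ids are assumptions made for the sketch. In SQL Server the same predicate, i.Form IS NULL OR items.Form = i.Form, could go inside one OUTER APPLY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (id INTEGER, material TEXT, form TEXT);
    CREATE TABLE input (material TEXT, form TEXT);
    INSERT INTO items VALUES (1, 'Aluminium', 'Sheets'),
                             (1, 'Carbon Steel', 'Bars'),
                             (2, 'Aluminium', 'Bars');
""")

# An id matches when no input row is left unsatisfied.
# A NULL form in the input accepts any form of that material.
MATCH_SQL = """
    SELECT DISTINCT t.id
    FROM items t
    WHERE NOT EXISTS (
        SELECT 1 FROM input i
        WHERE NOT EXISTS (
            SELECT 1 FROM items it
            WHERE it.id = t.id
              AND it.material = i.material
              AND (i.form IS NULL OR it.form = i.form)
        )
    )
    ORDER BY t.id
"""

def matching_ids(input_rows):
    """Replace the input rows, then return the ids that satisfy all of them."""
    conn.execute("DELETE FROM input")
    conn.executemany("INSERT INTO input VALUES (?, ?)", input_rows)
    return [row[0] for row in conn.execute(MATCH_SQL)]

type1 = matching_ids([("Aluminium", None)])
type2 = matching_ids([("Aluminium", "Sheets")])
type3 = matching_ids([("Aluminium", "Sheets"), ("Carbon Steel", None)])
```

With the sample Items rows, input type 1 matches ids 1 and 2, while input types 2 and 3 match only id 1.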
I'm very new to SQL and I hope someone can help me with some SQL syntax. I have a database with these tables and fields,
DATA: data_id, person_id, attribute_id, date, value
PERSONS: person_id, parent_id, name
ATTRIBUTES: attribute_id, attribute_type
attribute_type can be "Height" or "Weight"
Question 1
Given a person's "Name", I would like to return a table of "Weight" measurements for each of their children. I.e., if John has 3 children named Alice, Bob and Carol, then I want a table like this
| date | Alice | Bob | Carol |
I know how to get a long list of children's weights like this:
select d.date,
d.value
from data d,
persons child,
persons parent,
attributes a
where parent.name='John'
and child.parent_id = parent.person_id
and d.person_id = child.person_id
and d.attribute_id = a.attribute_id
and a.attribute_type = 'Weight';
but I don't know how to create a new table that looks like:
| date | Child 1 name | Child 2 name | ... | Child N name |
Question 2
Also, I would like to select the attributes to be between a certain range.
Question 3
What happens if the dates are not consistent across the children? For example, suppose Alice is 3 years older than Bob, then there's no data for Bob during the first 3 years of Alice's life. How does the database handle this if we request all the data?
1) It might not be so easy. MS SQL Server can PIVOT a table on an axis, but dumping the resultset to an array and sorting there (assuming this is tied to some sort of program) might be the simpler way right now if you're new to SQL.
If you can manage to do it in SQL it still won't be enough info to create a new table, just return the data you'd use to fill it in, so some sort of external manipulation will probably be required. But you can probably just use INSERT INTO [new table] SELECT [...] to fill that new table from your select query, at least.
2) You can join on attributes once for each attribute type you need:
SELECT [...] FROM data AS d
JOIN persons AS p ON d.person_id = p.person_id
LEFT JOIN attributes AS weight ON d.attribute_id = weight.attribute_id
AND weight.attribute_type = 'Weight'
LEFT JOIN attributes AS height ON d.attribute_id = height.attribute_id
AND height.attribute_type = 'Height'
[...]
(The way you're joining in the original query is just shorthand for [INNER] JOIN .. ON; it's the same thing, except that here the attribute_type filter moves into each ON condition.)
3) It depends on the type of JOIN you use to match parent/child relationships, and any dates you're filtering on in the WHERE, if I'm reading that right (entirely possible I'm not). I'm not sure quite what you're looking for, or what kind of database you're using, so no good answer. If you're new enough to SQL that you don't know the different kinds of JOINs and what they can do, it's very worthwhile to learn them - they put the R in RDBMS.
When you do a select, you need to specify the exact columns you want. In other words, you can't return the Nth child's name. I.e., this isn't possible:
1/2/2010 | Child_1_name | Child_2_name | Child_3_name
1/3/2010 | Child_1_name
1/4/2010 | Child_1_name | Child_2_name
Each record needs to have the same number of columns. So you might be able to make a select that does this:
1/2/2010 | Child_1_name
1/2/2010 | Child_2_name
1/2/2010 | Child_3_name
1/3/2010 | Child_1_name
1/4/2010 | Child_1_name
1/4/2010 | Child_2_name
And then, in a report, remap it to how you want it displayed.
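Since a single SELECT needs a fixed column list, one option is to let the application look up the children first and then build one MAX(CASE ...) column per child. The following is a minimal sketch of that idea using an in-memory SQLite database from Python. The measurements, the helper name weights_by_child and the range bounds are made up for illustration, and the database engine in the question is unknown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE persons (person_id INTEGER, parent_id INTEGER, name TEXT);
    CREATE TABLE attributes (attribute_id INTEGER, attribute_type TEXT);
    CREATE TABLE data (data_id INTEGER, person_id INTEGER, attribute_id INTEGER,
                       date TEXT, value REAL);
    INSERT INTO persons VALUES (1, NULL, 'John'), (2, 1, 'Alice'), (3, 1, 'Bob');
    INSERT INTO attributes VALUES (1, 'Height'), (2, 'Weight');
    -- Made-up measurements: Bob has no weight row for 2010-01-02.
    INSERT INTO data VALUES (1, 2, 2, '2010-01-02', 20.0),
                            (2, 2, 2, '2010-01-03', 21.0),
                            (3, 3, 2, '2010-01-03', 9.5),
                            (4, 2, 1, '2010-01-03', 110.0);
""")

def weights_by_child(parent_name, low, high):
    """Return (child names, rows of date plus one weight column per child)."""
    children = [row[0] for row in conn.execute(
        "SELECT c.name FROM persons c JOIN persons p ON c.parent_id = p.person_id "
        "WHERE p.name = ? ORDER BY c.name", (parent_name,))]
    if not children:
        return [], []
    # One pivot column per child; the names are bound as parameters.
    cols = ", ".join("MAX(CASE WHEN c.name = ? THEN d.value END)" for _ in children)
    sql = f"""
        SELECT d.date, {cols}
        FROM data d
        JOIN persons c ON d.person_id = c.person_id
        JOIN persons p ON c.parent_id = p.person_id
        JOIN attributes a ON d.attribute_id = a.attribute_id
        WHERE p.name = ? AND a.attribute_type = 'Weight'
          AND d.value BETWEEN ? AND ?
        GROUP BY d.date
        ORDER BY d.date
    """
    rows = conn.execute(sql, (*children, parent_name, low, high)).fetchall()
    return children, rows

children, rows = weights_by_child("John", 0, 100)
```

Dates on which a child has no measurement show up as None in that child's column, which covers the case of children with non-overlapping date ranges. The BETWEEN clause is one way to restrict the values to a range.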