UPDATE x1 a SET a.dept_cd = (SELECT DISTINCT dept_cd FROM x2 b WHERE a.nm = b.nm)
It's my sql
DISTINCT should make the data unique, but the query results in an error message:
row subquery returns more than one row
My data is strings, so I use the name (nm) to look up the code (dept_cd).
Can you help me?
If this query returns that error, it means that you have more than one distinct dept_cd for the nm you are looking up.
DISTINCT only prevents the same dept_cd value from appearing twice; it does not reduce several different values to one.
If you only need the first one, no matter what its value is, you can add LIMIT 0,1 at the end of your subquery.
If you need a specific value, you will have to change the query to isolate it, but without the full context we cannot help you with that.
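To make the two suggestions above concrete, here is a minimal sketch that uses Python's sqlite3 module as a stand-in for the asker's database. The sample rows are hypothetical; only the table and column names come from the question. SQLite does not raise this particular error, so the sketch first finds the names that map to more than one code and then keeps a single code per name. The ORDER BY is an assumption about what "first" should mean.

```python
import sqlite3

# Hypothetical sample data; only x1, x2, nm and dept_cd come from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE x1 (nm TEXT, dept_cd TEXT);
    CREATE TABLE x2 (nm TEXT, dept_cd TEXT);
    INSERT INTO x1 (nm) VALUES ('sales'), ('hr');
    INSERT INTO x2 VALUES
        ('sales', 'D10'), ('sales', 'D10'), ('sales', 'D20'), ('hr', 'D30');
""")

# Names with more than one distinct code: DISTINCT cannot collapse these.
ambiguous = conn.execute("""
    SELECT nm, COUNT(DISTINCT dept_cd)
    FROM x2
    GROUP BY nm
    HAVING COUNT(DISTINCT dept_cd) > 1
""").fetchall()
print(ambiguous)

# Keep one code per name. ORDER BY makes the pick deterministic
# (smallest code); that rule is an assumption, not part of the question.
conn.execute("""
    UPDATE x1
    SET dept_cd = (SELECT b.dept_cd FROM x2 b
                   WHERE b.nm = x1.nm
                   ORDER BY b.dept_cd LIMIT 1)
""")
updated = conn.execute("SELECT nm, dept_cd FROM x1 ORDER BY nm").fetchall()
print(updated)
```

Running the diagnostic query first shows which names need a business rule before any LIMIT is applied.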
I have two tables that I am left joining together. The first table has transactional-level detail, which causes the key I use to join to the second table to be duplicated. When I left join the second table, the measure "company_spend" is highly inflated.
I need a way to keep only a single value of the duplicated data. My thought was to run DISTINCT on only those columns, but as far as I can see BigQuery does not support DISTINCT on a subset of columns, only on all of them.
SELECT UPPER(cwnextt.Current_Contract_Number) AS Current_Contract_Number,
UPPER(cwnextt.Replacement_Contract_Number) AS Replacement_Contract_Number,
UPPER(cwnextt.Current_Contract_Name) AS Current_Contract_Name,
UPPER(cwnextt.Supplier_Top_Parent_Entity_Code) AS Supplier_Top_Parent_Entity_Code,
UPPER(cwnextt.Supplier_Top_Parent_Name) AS Supplier_Top_Parent_Name,
UPPER(cwnextt.company_Entity_Code) AS company_Entity_Code,
UPPER(cwnextt.Facility_Name) AS Facility_Name,
smart.company_Spend AS companySpend
FROM `test_etl_field.contracts_with_member_entity_codes_test_view_2` cwnextt
--this table is what is causing the table below to duplicate,
--but I need all of this data as well, in its current format.
LEFT JOIN `test.trans_analysis` tsa
ON TRIM(UPPER(cwnextt.company_entity_code)) = TRIM(UPPER(tsa.company_entity_code))
AND TRIM(UPPER(cwnextt.Supplier_Top_Parent_Entity_Code)) = TRIM(UPPER(tsa.manufacturer_top_parent_entity_code))
AND TRIM(UPPER(cwnextt.Current_Contract_Name)) = TRIM(UPPER(tsa.contract_category))
AND cwnextt.spend_period_yyyyqmm = tsa.spend_period_yyyyqmm
--this table contains "company_spend" which is now duplicated
LEFT JOIN `test_etl_field.ecr_smart_data` smart
ON smart.company_entity_code = cwnextt.company_entity_code
AND (smart.contract_number = cwnextt.current_contract_number
OR smart.contract_number = cwnextt.replacement_contract_number)
AND smart.month_key = cwnextt.spend_period_yyyyqmm
If something can be created that will keep company_spend from duplicating on the second left join, that is what I am after.
I'm not sure I understand all the details of your problem, but here is a fact from the BigQuery documentation:
SELECT DISTINCT
A SELECT DISTINCT statement discards duplicate rows
and returns only the remaining rows.
You can't apply DISTINCT to specific columns only, because it would not make sense. Say you have 4 columns and call DISTINCT on 3 of them: what is SQL supposed to do with the last one?
You must tell SQL which value to keep for the remaining column, and GROUP BY is the right tool for that.
So if you want to:
Remove a column that has been duplicated: just adjust your SELECT to return only the columns you want.
Remove rows that have the same value in specific columns: I would suggest a GROUP BY on the targeted columns, with the aggregation you want (first, avg, sum or whatever) for the remaining ones.
Remove the value from a row if another row has the same value: you may not want to do that. A row has to keep its value, and you won't get it back. Besides, it is the same problem: which row do you want to keep?
Hope this helps ! Feel free to give clarification on your problem if you want more specific answers.
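The GROUP BY suggestion above can be sketched as follows, using Python's sqlite3 module as a stand-in for BigQuery. The tables, columns and values are simplified and hypothetical; only the idea of a transaction-level table fanning out a spend value comes from the question. MAX is one possible choice of aggregate and assumes the repeated values within a group are identical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE contracts (entity_code TEXT, contract_number TEXT, trans_id INTEGER);
    CREATE TABLE smart (entity_code TEXT, contract_number TEXT, company_spend REAL);
    -- Three transaction-level rows share one join key.
    INSERT INTO contracts VALUES ('E1', 'C1', 1), ('E1', 'C1', 2), ('E1', 'C1', 3);
    INSERT INTO smart VALUES ('E1', 'C1', 100.0);
""")

join_sql = """
    FROM contracts c
    LEFT JOIN smart s
      ON s.entity_code = c.entity_code
     AND s.contract_number = c.contract_number
"""

# Summing straight over the join counts the spend once per transaction row.
inflated = conn.execute("SELECT SUM(s.company_spend)" + join_sql).fetchone()[0]

# Grouping by the join key and keeping one value per group removes the repeats.
deduped = conn.execute(
    "SELECT c.entity_code, c.contract_number, MAX(s.company_spend)"
    + join_sql
    + "GROUP BY c.entity_code, c.contract_number"
).fetchall()

print(inflated)
print(deduped)
```

If the transaction-level columns must stay in the output, the grouped query can instead be computed separately and joined back, so the spend is reported once per key rather than once per transaction.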
While I couldn't resolve this issue in SQL, I used a FIXED LOD expression in Tableau to aggregate the data past the duplicates, so the end user could visualize the output accurately. Not ideal, but the SQL route wasn't making sense.
I have a stored procedure which is doing the following.
The populated target table data is checked against several similar source tables for a match (based on name and address data). If a match is found in the first table then it updates the target with a flag identifying which source table the match was from. However if it doesn't find a match I need it to look in the next source table and the next until either a match is found or not as the case may be.
Is there an easy way for the UPDATE statement to provide some kind of return value I can query to say whether it updated the target table? I would like to use this return value so that I can skip checking subsequent source tables unnecessarily.
Otherwise will I have to perform the conditional UPDATE then do a separate query to determine if the UPDATE actually updated the flag?
Probably the safest approach is to use the OUTPUT clause. This will return the modified rows into a new table.
You can check the table to see if any rows have been updated.
One advantage of the OUTPUT clause is that you can update multiple rows at the same time.
I like Gordon's solution, but I do not think you actually need it.
Simply run the updates in order:
UPDATE BASE_TABLE
SET FLAG = 'first_table'
WHERE FLAG IS NULL AND
EXISTS (SELECT 1 FROM first_table f1 WHERE f1.ID = BASE_TABLE.ID);
-- BASE_TABLE.ID is qualified so that it is not resolved against the inner table
UPDATE BASE_TABLE
SET FLAG = 'second_table'
WHERE FLAG IS NULL AND
EXISTS (SELECT 1 FROM second_table f2 WHERE f2.ID = BASE_TABLE.ID);
...
And so on.
You don't need to check every row conditionally; that would be very slow.
You can put your update in a try/catch and insert the result into another table.
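As a rough illustration of running the updates in order, the sketch below uses Python's sqlite3 module in place of SQL Server, with hypothetical sample rows. The FLAG IS NULL filter is what lets later sources skip rows that were already matched. The cursor's rowcount is used here to report how many rows each statement changed; it is a swapped-in stand-in for this demo, not the OUTPUT clause mentioned earlier in the thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE base_table (id INTEGER PRIMARY KEY, flag TEXT);
    CREATE TABLE first_table (id INTEGER);
    CREATE TABLE second_table (id INTEGER);
    INSERT INTO base_table (id) VALUES (1), (2), (3);
    INSERT INTO first_table VALUES (1);
    INSERT INTO second_table VALUES (1), (2);
""")

counts = {}
for source in ("first_table", "second_table"):
    # The table name comes from the fixed tuple above, never from user input.
    cur = conn.execute(
        f"UPDATE base_table SET flag = ? "
        f"WHERE flag IS NULL "
        f"AND EXISTS (SELECT 1 FROM {source} s WHERE s.id = base_table.id)",
        (source,),
    )
    counts[source] = cur.rowcount  # number of rows this statement changed

flags = conn.execute("SELECT id, flag FROM base_table ORDER BY id").fetchall()
print(counts)
print(flags)
```

Row 1 exists in both sources but keeps the flag of the first one, because the second statement only touches rows whose flag is still NULL.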
I was practicing subqueries in SQL when I stumbled on an unusual query that I never thought could work.
The exercise was:
Write a query to display the average rate of the Australian dollar where the currency rate date is July 1, 2005.
And my query was:
USE AdventureWorks2012
SELECT AverageRate FROM Sales.CurrencyRate
WHERE ToCurrencyCode='AUD' AND CurrencyRateDate IN
(SELECT CurrencyRateDate FROM Sales.Currency
WHERE CurrencyRateDate='2005-07-01')
So my question is: how is it possible to use the column name "CurrencyRateDate" in the subquery when it actually belongs to the table "CurrencyRate"?
I know my query is not written the way it should be.
I'm sorry if my title doesn't make sense. If you can think of a better one, please change it.
Thanks
AND CurrencyRateDate IN
(SELECT CurrencyRateDate FROM Sales.Currency
WHERE CurrencyRateDate='2005-07-01')
All the CurrencyRateDate references here point to the column from the outer query.
So for each row in the outer query, you are getting a list consisting of only that row's CurrencyRateDate, repeated once for every row in the Sales.Currency table (if the CurrencyRateDate of that row is 2005-07-01, otherwise the list is empty).
Then you check whether the outer CurrencyRateDate value is in that list. Which it is, if and only if it's equal to 2005-07-01 (assuming there is at least one row in Sales.Currency).
So your query is equivalent to:
SELECT AverageRate FROM Sales.CurrencyRate
WHERE ToCurrencyCode='AUD' AND CurrencyRateDate='2005-07-01'
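The name resolution described above can be checked with a small sketch. It uses Python's sqlite3 module rather than SQL Server, and the tables are simplified, hypothetical versions of Sales.CurrencyRate and Sales.Currency. The inner table has no rate_date column, so the unqualified name falls through to the outer query; qualifying it with the inner alias makes the mistake visible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE currency_rate (to_code TEXT, rate_date TEXT, average_rate REAL);
    CREATE TABLE currency (code TEXT, name TEXT);  -- no rate_date column here
    INSERT INTO currency_rate VALUES
        ('AUD', '2005-07-01', 1.5),
        ('AUD', '2005-07-02', 1.6),
        ('EUR', '2005-07-01', 0.8);
    INSERT INTO currency VALUES ('AUD', 'Australian Dollar'), ('EUR', 'Euro');
""")

# rate_date inside the subquery resolves to the outer currency_rate row.
with_subquery = conn.execute("""
    SELECT average_rate FROM currency_rate
    WHERE to_code = 'AUD' AND rate_date IN
        (SELECT rate_date FROM currency WHERE rate_date = '2005-07-01')
""").fetchall()

# The simplified form without the subquery.
plain = conn.execute("""
    SELECT average_rate FROM currency_rate
    WHERE to_code = 'AUD' AND rate_date = '2005-07-01'
""").fetchall()

# Qualifying the column with the inner table's alias exposes the mistake.
try:
    conn.execute("SELECT c.rate_date FROM currency c")
    qualified_error = None
except sqlite3.OperationalError as exc:
    qualified_error = str(exc)

print(with_subquery, plain, qualified_error)
```

Prefixing every column in a subquery with its table alias is a cheap way to catch this kind of accidental correlation.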
I'm using Postgres and I'd like to know how to change row values within a query. Let's say I have a column called Numbers whose rows contain 1, 2, 3, 4, 5. How could I change what those rows display? For example, if I want the query to display 1, 1, 1, 1, 5, how would I write a query in which each row is changed to 1 unless it is 5? Again, I only want to change it within the query; I'm not trying to do an UPDATE. I realize how newbish this is on my part, but I couldn't find this on Google.
SELECT
CASE WHEN Numbers <> 5 THEN 1 ELSE Numbers END AS Numbers
FROM my_table -- placeholder name; TABLE itself is a reserved word
See 9.12. Conditional Expressions in the PostgreSQL documentation.
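Here is a small sketch of the CASE expression in action, run through Python's sqlite3 module instead of Postgres. The table name my_table is hypothetical. It also confirms that the stored values are left unchanged, since no UPDATE is involved.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE my_table (numbers INTEGER);
    INSERT INTO my_table VALUES (1), (2), (3), (4), (5);
""")

# What the query displays: every value becomes 1 unless it is 5.
shown = [row[0] for row in conn.execute(
    "SELECT CASE WHEN numbers <> 5 THEN 1 ELSE numbers END AS numbers "
    "FROM my_table ORDER BY rowid"
)]

# What the table still contains afterwards.
stored = [row[0] for row in conn.execute(
    "SELECT numbers FROM my_table ORDER BY rowid"
)]

print(shown)
print(stored)
```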
Select Null as Empty from (select * from TblMetaData)
It looks like it is trying to return one NULL row for every row in TblMetaData.
EDIT: This could be written as
SELECT Null AS Empty FROM tblMetaData
It will yield a result set with one column named Empty which only contains NULL values. The number of rows will be equal to the number of rows available in TblMetaData.
It looks like the result of one of two possible situations:
The developer was getting paid per line, and threw in that query. It was probably originally structured to take more than one line.
The developer was incompetent and this was the only way they could think of to generate a bunch of null values.
The query returns a null value from each line of the table, so the only real information in the result is the number of records in the table.
This can of course be found out a lot more efficiently using:
select count(*) as Count from TblMetaData
It's possible that the developer was not at all aware of the count aggregate (or how to search the web) and tried to get the number of records while making the result as small as possible.
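The claim that the only information in the result is the row count can be checked with a short sketch. It uses Python's sqlite3 module and a hypothetical three-row TblMetaData; the original database system is not stated in the thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TblMetaData (name TEXT);
    INSERT INTO TblMetaData VALUES ('a'), ('b'), ('c');
""")

# The query from the question: one NULL per row of the table.
cur = conn.execute("SELECT NULL AS Empty FROM (SELECT * FROM TblMetaData)")
column_name = cur.description[0][0]
null_rows = cur.fetchall()

# The direct way to get the same information.
row_count = conn.execute("SELECT COUNT(*) FROM TblMetaData").fetchone()[0]

print(column_name, null_rows, row_count)
```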
It is often used in this kind of expression:
select * from TableA where exists
(select null from TableB where TableB.Col1=TableA.Col1)
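As a quick check of why SELECT NULL is harmless inside EXISTS, the sketch below runs the same pattern with two different select lists. It uses Python's sqlite3 module with hypothetical sample rows, and the helper function matching is introduced only for this illustration. EXISTS only tests whether the subquery returns any row, so the selected value does not affect the result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (Col1 INTEGER);
    CREATE TABLE TableB (Col1 INTEGER);
    INSERT INTO TableA VALUES (1), (2), (3);
    INSERT INTO TableB VALUES (2), (3), (3);
""")

def matching(select_list):
    """Return TableA values that have a match in TableB (hypothetical helper)."""
    sql = (
        "SELECT Col1 FROM TableA WHERE EXISTS "
        f"(SELECT {select_list} FROM TableB WHERE TableB.Col1 = TableA.Col1) "
        "ORDER BY Col1"
    )
    return [row[0] for row in conn.execute(sql)]

with_null = matching("NULL")
with_star = matching("*")
print(with_null, with_star)
```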
It can be used to get the number of rows in the table TblMetaData. In the output, the column heading shows only the first letter of the alias (E for Empty in this case).
For example, suppose you ran
Select Null as Empty from (select * from TblMetaData)
It would give
E
n rows selected
where n is the number of rows in the table.
Suppose you ran
Select Null as XYZ from (select * from TblMetaData)
The result would be the same, but the column heading would change:
X
n rows selected