How to rewrite query - sql

I need to update a table based on a select. The following statement fails with: The column '' was specified multiple times for 'Q'.
UPDATE Evolution1.DimAdministrator
SET Evolution1.DimAdministrator.ClaimSystemCodeId = 17
FROM Evolution1.DimAdministrator da INNER JOIN (
Select
ExtractId,
base.AdministratorId,
base.CardprocessorAdministratorId,
AdministratorName,
EffectiveDate,
CancelDate ,
State,
StageError ,
AdministratorKey,
CustomerKey ,
Name ,
EffectiveDateKey ,
CancelDateKey,
StateProvinceKey ,
Alias ,
NavId ,
warehouse.AdministratorId ,
warehouse.CardprocessorAdministratorId,
warehouse.ClaimSystemCodeId,
Inserted ,
Updated
FROM OneStage.OnePay.Administrator base
INNER JOIN OneWarehouse.Evolution1.DimAdministrator warehouse ON base.AdministratorId = warehouse.AdministratorId
WHERE base.ClaimSystemCodeId <> warehouse.ClaimSystemCodeId
AND base.ClaimSystemCodeId = 1
) AS Q
Help please. Thanks.

You have multiple columns with duplicate names.
Put an alias on them like this:
UPDATE Evolution1.DimAdministrator
SET Evolution1.DimAdministrator.ClaimSystemCodeId = 17
FROM Evolution1.DimAdministrator da INNER JOIN (
Select
ExtractId,
base.AdministratorId AS base_AdminID,
base.CardprocessorAdministratorId AS base_CardID,
AdministratorName,
EffectiveDate,
CancelDate ,
State,
StageError ,
AdministratorKey,
CustomerKey ,
Name ,
EffectiveDateKey ,
CancelDateKey,
StateProvinceKey ,
Alias ,
NavId ,
warehouse.AdministratorId AS WH_AdminID,
warehouse.CardprocessorAdministratorId AS WH_CardID,
warehouse.ClaimSystemCodeId,
Inserted ,
Updated
FROM OneStage.OnePay.Administrator base
INNER JOIN OneWarehouse.Evolution1.DimAdministrator warehouse ON base.AdministratorId = warehouse.AdministratorId
WHERE base.ClaimSystemCodeId <> warehouse.ClaimSystemCodeId
AND base.ClaimSystemCodeId = 1
) AS Q
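Also, are you sure you don't need to JOIN Q ON something? As written, the INNER JOIN to the derived table Q has no ON clause, which an INNER JOIN requires.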
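If the goal is only to flag the mismatched rows, the derived table does not need all of those columns. Here is a minimal sketch of that idea, using Python's built-in sqlite3 module as a stand-in for SQL Server. The table names stage_admin and dim_admin, the column names, and the sample rows are made up for illustration. The subquery selects just the key, so no column name can appear twice, and the update matches on that key.

```python
import sqlite3

# In-memory database with two hypothetical tables standing in for
# OneStage.OnePay.Administrator and Evolution1.DimAdministrator.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stage_admin (administrator_id INTEGER, claim_system_code_id INTEGER);
    CREATE TABLE dim_admin   (administrator_id INTEGER, claim_system_code_id INTEGER);
    INSERT INTO stage_admin VALUES (1, 1), (2, 1), (3, 2);
    INSERT INTO dim_admin   VALUES (1, 5), (2, 1), (3, 9);
""")

# The subquery returns only the key column, then the update matches on it.
conn.execute("""
    UPDATE dim_admin
    SET claim_system_code_id = 17
    WHERE administrator_id IN (
        SELECT base.administrator_id
        FROM stage_admin base
        INNER JOIN dim_admin warehouse
            ON base.administrator_id = warehouse.administrator_id
        WHERE base.claim_system_code_id <> warehouse.claim_system_code_id
          AND base.claim_system_code_id = 1
    )
""")

updated = conn.execute(
    "SELECT administrator_id, claim_system_code_id "
    "FROM dim_admin ORDER BY administrator_id"
).fetchall()
print(updated)
```

Only administrator 1 has a stage code of 1 that differs from the warehouse code, so only that row is changed. In SQL Server the same effect should be reachable by keeping the UPDATE ... FROM form and joining Q with an ON clause on the key.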

Related

Multiple Joins And Writing to Destination Table with BigQuery

I have the following query that works fine if I DON'T set a destination table.
SELECT soi.customer_id
, p.department
, p.category
, p.subcategory
, p.tier1
, p.tier2
, pc.bucket as categorization
, SUM(soi.price) as demand
, COUNT(1) as cnt
FROM store.sales_item soi
INNER JOIN datamart.product p ON (soi.product_id = p.product_id)
INNER JOIN daily_customer_fact.dcf_product_categorization pc
ON (p.department = pc.department
AND p.category = pc.category
AND p.subcategory = pc.subcategory
AND p.tier1 = pc.tier1
AND p.tier2 = pc.tier2)
WHERE DATE(soi.created_timestamp) < current_date()
GROUP EACH BY 1,2,3,4,5,6,7 LIMIT 10
However, if I set a destination table, it fails with
Error: Ambiguous field name 'app_version' in JOIN. Please use the table qualifier before field name.
That column exists on the store.sales_item table, but I'm not selecting nor joining to that column.
I've seen this error message before, and it points to the following:
Your query job when specifying a destination table is setting flattenResults to false.
Both of the store.sales_item and datamart.product tables contain a field named "app_version".
If so, I recommend looking at this answer:
https://stackoverflow.com/a/28996481/4001094
As well as this issue report: https://code.google.com/p/google-bigquery/issues/detail?id=459
In your case, you should be able to make your query succeed by doing something like the following, using suggestion #3 from the answer linked above. I'm unable to test it as I don't have access to your source tables, but it should be close to working with flattenResults set to false.
SELECT soi_and_p.customer_id
, soi_and_p.department
, soi_and_p.category
, soi_and_p.subcategory
, soi_and_p.tier1
, soi_and_p.tier2
, pc.bucket as categorization
, SUM(soi_and_p.price) as demand
, COUNT(1) as cnt
FROM
(SELECT soi.customer_id AS customer_id
, p.department AS department
, p.category AS category
, p.subcategory AS subcategory
, p.tier1 AS tier1
, p.tier2 AS tier2
, soi.price AS price
, soi.created_timestamp AS created_timestamp
FROM store.sales_item soi
INNER JOIN datamart.product p ON (soi.product_id = p.product_id)
) as soi_and_p
INNER JOIN daily_customer_fact.dcf_product_categorization pc
ON (soi_and_p.department = pc.department
AND soi_and_p.category = pc.category
AND soi_and_p.subcategory = pc.subcategory
AND soi_and_p.tier1 = pc.tier1
AND soi_and_p.tier2 = pc.tier2)
WHERE DATE(soi_and_p.created_timestamp) < current_date()
GROUP EACH BY 1,2,3,4,5,6,7 LIMIT 10
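The shape of that rewrite can be sketched on a small scale. The example below uses Python's sqlite3 module rather than BigQuery, so it does not reproduce the flattenResults error. It only illustrates the pattern: both source tables carry an app_version column, and the inner subquery projects just the explicitly aliased columns the outer query needs. The tables are cut down to a few columns and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Both tables have an app_version column, as in the question.
    CREATE TABLE sales_item (customer_id INTEGER, product_id INTEGER, price REAL, app_version TEXT);
    CREATE TABLE product (product_id INTEGER, department TEXT, app_version TEXT);
    CREATE TABLE categorization (department TEXT, bucket TEXT);
    INSERT INTO sales_item VALUES (100, 1, 5.0, 'v1'), (100, 1, 7.5, 'v2'), (200, 2, 3.0, 'v1');
    INSERT INTO product VALUES (1, 'toys', 'x'), (2, 'books', 'y');
    INSERT INTO categorization VALUES ('toys', 'fun'), ('books', 'learn');
""")

cursor = conn.execute("""
    SELECT sp.customer_id
         , sp.department
         , pc.bucket AS categorization
         , SUM(sp.price) AS demand
         , COUNT(1) AS cnt
    FROM (
        -- Project only the needed columns, each with an explicit alias,
        -- so app_version never reaches the outer join.
        SELECT soi.customer_id AS customer_id
             , p.department AS department
             , soi.price AS price
        FROM sales_item soi
        INNER JOIN product p ON soi.product_id = p.product_id
    ) AS sp
    INNER JOIN categorization pc ON sp.department = pc.department
    GROUP BY sp.customer_id, sp.department, pc.bucket
    ORDER BY sp.customer_id
""")
column_names = [d[0] for d in cursor.description]
rows = cursor.fetchall()
print(column_names)
print(rows)
```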

How do I change this sql statement to only select the first from each ID?

I have the below code to select from a database, however, I only want the first record for each unique ID. Is there a way to change the SQL to achieve this?
SELECT
[CARL_Property].ID
,[PrDoorNum]
,[PrAddress1]
,[PrAddress2]
,[PrAddress3]
,[PrAddress4]
,[PrPostcode]
,[PrRent]
,[PrAgreedRent]
,[PrCommence]
,[PrEnd]
,[PrAvailable]
,[PrGrossIncome]
,[PrCouncilTax]
,[PrInventoryFee]
,[PrLetFee]
,[PrReletFee]
,[PrDateWithdrawn]
,[Rent Review]
,CARL_Owners.OwForenames
,CARL_Owners.OwSurname
,CARL_Property_List.[ID]
,CARL_Property_List.[PrId]
,CARL_Property_List.[PLBedrooms]
,CARL_Property_List.[PlRooms]
,CARL_Property_List.[PlBathrooms]
,CARL_Property_List.[PlReceptions]
,CARL_Property_List.[PlDeposit]
,CARL_Tenant_Contacts.[Tenant Name]
,CARL_New_Tenants.[TnLeaseperiod]
,CARL_Property_List.[PlAdvertising]
,[CARL_Property_Memos].[PrNotes]
,[CARL_Safety].[PrGasInsp]
from dbo.CARL_Property Join dbo.[CARL_Property_Memos] on CARL_Property.ID=CARL_Property_Memos.PrID Join dbo.CARL_Owners on CARL_Owners.ID=CARL_Property.OwID Join dbo.CARL_PROPERTY_LIST ON dbo.CARL_PROPERTY.ID=dbo.CARL_PROPERTY_LIST.PrId Join dbo.[CARL_New_Tenants] ON CARL_New_Tenants.PrId=CARL_Property.ID JOIN CARL_Tenant_Contacts ON CARL_New_Tenants.ID = CARL_Tenant_Contacts.TnID Join [dbo].[CARL_Safety] On dbo.CARL_Property.ID=dbo.CARL_Safety.PrID
The result is as seen below.
Something along these lines I think is what you are looking for. Also, notice that I used aliases in your main query. It makes this a lot simpler to work with and reduces the amount of typing by a LOT.
with SortedResults as
(
SELECT
p.ID
, ROW_NUMBER() over(partition by p.ID order by pl.ID) as RowNum --order by whatever column defines "first"
, [PrDoorNum]
, [PrAddress1]
, [PrAddress2]
, [PrAddress3]
, [PrAddress4]
, [PrPostcode]
, [PrRent]
, [PrAgreedRent]
, [PrCommence]
, [PrEnd]
, [PrAvailable]
, [PrGrossIncome]
, [PrCouncilTax]
, [PrInventoryFee]
, [PrLetFee]
, [PrReletFee]
, [PrDateWithdrawn]
, [Rent Review]
, o.OwForenames
, o.OwSurname
, pl.[ID] as PL_ID
, pl.[PrId]
, pl.[PLBedrooms]
, pl.[PlRooms]
, pl.[PlBathrooms]
, pl.[PlReceptions]
, pl.[PlDeposit]
, tc.[Tenant Name]
, nt.[TnLeaseperiod]
, pl.[PlAdvertising]
, pm.[PrNotes]
, cs.[PrGasInsp]
from dbo.CARL_Property p
Join dbo.[CARL_Property_Memos] pm on p.ID = pm.PrID
Join dbo.CARL_Owners o on o.ID = p.OwID
Join dbo.CARL_PROPERTY_LIST pl ON p.ID = pl.PrId
Join dbo.[CARL_New_Tenants] nt ON nt.PrId = p.ID
JOIN CARL_Tenant_Contacts tc ON nt.ID = tc.TnID
Join [dbo].[CARL_Safety] cs On p.ID = cs.PrID
)
select *
from SortedResults
where RowNum = 1
order by ID
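To see the ROW_NUMBER pattern in isolation, here is a small sketch run through Python's sqlite3 module (window functions need SQLite 3.25 or newer). The two tables are cut-down, hypothetical versions of CARL_Property and CARL_Property_List, and "first" is assumed to mean the lowest listing ID.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE property (id INTEGER);
    CREATE TABLE property_list (id INTEGER, pr_id INTEGER, bedrooms INTEGER);
    INSERT INTO property VALUES (1), (2);
    -- Property 1 has two listings, so a plain join returns it twice.
    INSERT INTO property_list VALUES (10, 1, 2), (11, 1, 3), (12, 2, 1);
""")

first_rows = conn.execute("""
    WITH sorted_results AS (
        SELECT p.id AS id
             , pl.id AS pl_id
             , pl.bedrooms AS bedrooms
             -- Number the rows within each property; 1 marks the "first".
             , ROW_NUMBER() OVER (PARTITION BY p.id ORDER BY pl.id) AS row_num
        FROM property p
        JOIN property_list pl ON p.id = pl.pr_id
    )
    SELECT id, pl_id, bedrooms
    FROM sorted_results
    WHERE row_num = 1
    ORDER BY id
""").fetchall()
print(first_rows)
```

Changing the ORDER BY inside the OVER clause changes which row counts as the first one for each ID.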

Exclude few selected fields on group by

I have a query that returns member transaction information. It uses an aggregate function to calculate the amount, and the grouping works correctly. Now I need to add two more columns from different tables. When I tried adding them, the query returned duplicated information and a huge number of records.
Can anyone help? I just want to include the two fields in the query without adding them to the GROUP BY clause, and make sure the returned data contains no duplicates.
The query I used is below.
DECLARE @LastMonthExtractID Int = 11
SELECT x.*
,lstmnth.Submission ---added
,lm_subt.SubmissionTypeDescription ---added
FROM (
SELECT MemberRef --unique key
, SiteName
, ChargePeriod
, SUM(Amount) AS Amount
, TransactionMap
, PackageCode
FROM (
SELECT MemberRef
, SiteName
, ChargePeriod
, Amount
, PackageCode
, CASE WHEN map.TransactionMap = 'JoinFee' AND lstmnth.ChargeDate <> lstmnth.JoinDate THEN 'PayPlan'
WHEN map.TransactionMap = 'MemberFee' AND lstmnth.PackageCode LIKE 'PV%' AND lstmnth.SiteID <> 15 THEN 'VitalityMF' -- must use Package and not CURRENT PACKAGE
WHEN map.TransactionMap = 'MemberFee' AND lstmnth.PackageCode LIKE 'PV%' AND lstmnth.SiteID = 15 THEN 'PlatVitalityMF' -- PLATINUM
WHEN map.TransactionMap = 'MemberFee' AND lstmnth.PackageCode LIKE 'Z%' THEN 'ZContract'
WHEN map.TransactionMap IS NULL THEN 'Other'
ELSE map.TransactionMap END AS TransactionMap
--, lstmnth.Submission
--, lm_subt.SubmissionTypeDescription --added
FROM dbo.CCX_Billing lstmnth
LEFT JOIN dbo.TransactionMap map on lstmnth.TransactionType = map.TransactionType
AND lstmnth.TransactionDescription = map.TransactionDescription
AND ISNULL (lstmnth.AnalysisCode, '') = map.AnalysisCode
WHERE lstmnth.ExtractID = @LastMonthExtractID
) l
GROUP BY SiteName, MemberRef, ChargePeriod, PackageCode, TransactionMap
) x
INNER JOIN dbo.CCX_Billing lstmnth ON lstmnth.MemberRef = x.MemberRef
LEFT JOIN dbo.CCX_Billing_PSubmission lm_sub on lstmnth.SubmissionID = lm_sub.ID
INNER JOIN dbo.CCX_Billing_SubmissionType lm_subt on lm_sub.SubmissionTypeID = lm_subt.SubmissionID --added

SQL Query to get only 1 instance of a record where one to many relationship exists

I've got a SQL query that tries to get 1 record back when a 1-to-many relationship exists.
SELECT dbo.BlogEntries.ID AS blog_entries_id, dbo.BlogEntries.BlogTitle, dbo.BlogEntries.BlogEntry, dbo.BlogEntries.BlogName,
dbo.BlogEntries.DateCreated AS blog_entries_datecreated, dbo.BlogEntries.inActive AS blog_entries_in_active,
dbo.BlogEntries.HtmlMetaDescription AS blog_entries_html_meta_description, dbo.BlogEntries.HtmlMetaKeywords AS blog_entries_html_meta_keywords,
dbo.BlogEntries.image1, dbo.BlogEntries.image2, dbo.BlogEntries.image3, dbo.BlogEntries.formSelector, dbo.BlogEntries.image1Alignment,
dbo.BlogEntries.image2Alignment, dbo.BlogEntries.image3Alignment, dbo.BlogEntries.blogEntryDisplayName, dbo.BlogEntries.published AS blog_entries_published,
dbo.BlogEntries.entered_by, dbo.BlogEntries.dateApproved, dbo.BlogEntries.approved_by, dbo.blog_entry_tracking.id AS blog_entry_tracking_id,
dbo.blog_entry_tracking.blog, dbo.blog_entry_tracking.blog_entry, dbo.BlogCategories.ID, dbo.BlogCategories.BlogCategoryName,
dbo.BlogCategories.BlogCategoryComments, dbo.BlogCategories.DateCreated, dbo.BlogCategories.BlogCategoryTitle, dbo.BlogCategories.BlogCategoryTemplate,
dbo.BlogCategories.inActive, dbo.BlogCategories.HtmlMetaDescription, dbo.BlogCategories.HtmlMetaKeywords, dbo.BlogCategories.entry_sort_order,
dbo.BlogCategories.per_page, dbo.BlogCategories.shorten_page_content, dbo.BlogCategories.BlogCategoryDisplayName, dbo.BlogCategories.published,
dbo.BlogCategories.blogParent
FROM dbo.BlogEntries LEFT OUTER JOIN
dbo.blog_entry_tracking ON dbo.BlogEntries.ID = dbo.blog_entry_tracking.blog_entry LEFT OUTER JOIN
dbo.BlogCategories ON dbo.blog_entry_tracking.blog = dbo.BlogCategories.ID
I have some records assigned to 2 different blog categories, and when I query everything it returns duplicate records.
How do I return only 1 instance of a blog?
Try this one -
SELECT blog_entries_id = be.Id
, be.BlogTitle
, be.BlogEntry
, be.BlogName
, blog_entries_datecreated = be.DateCreated
, blog_entries_in_active = be.inActive
, blog_entries_html_meta_description = be.HtmlMetaDescription
, blog_entries_html_meta_keywords = be.HtmlMetaKeywords
, be.image1
, be.image2
, be.image3
, be.formSelector
, be.image1Alignment
, be.image2Alignment
, be.image3Alignment
, be.blogEntryDisplayName
, blog_entries_published = be.published
, be.entered_by
, be.dateApproved
, be.approved_by
, blog_entry_tracking_id = bet.Id
, bet.blog
, bet.blog_entry
, bc2.Id
, bc2.BlogCategoryName
, bc2.BlogCategoryComments
, bc2.DateCreated
, bc2.BlogCategoryTitle
, bc2.BlogCategoryTemplate
, bc2.inActive
, bc2.HtmlMetaDescription
, bc2.HtmlMetaKeywords
, bc2.entry_sort_order
, bc2.per_page
, bc2.shorten_page_content
, bc2.BlogCategoryDisplayName
, bc2.published
, bc2.blogParent
FROM dbo.BlogEntries be
LEFT JOIN dbo.blog_entry_tracking bet ON be.Id = bet.blog_entry
OUTER APPLY (
SELECT TOP 1 *
FROM dbo.BlogCategories bc
WHERE bet.blog = bc.Id
) bc2
Also, I would like to mention that using table aliases here shortens your query and makes it easier to read.
If you just need one record back, you can use
SELECT TOP 1 dbo.BlogEntries.ID AS blog_entries_id, dbo.Bl.... (same as you have now).
It is more efficient than SELECT DISTINCT.
Here is a Northwind Example.
It will return only 1 row in the Order Detail table for each Order.
Use Northwind
GO
Select COUNT(*) from dbo.Orders
select COUNT(*) from dbo.[Order Details]
select * from dbo.Orders ord
join
(select ROW_NUMBER() OVER(PARTITION BY OrderID ORDER BY UnitPrice DESC) AS "MyRowID" , * from dbo.[Order Details] innerOD) derived1
on ord.OrderID = derived1.OrderID
Where
derived1.MyRowID = 1
Order by ord.OrderID
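A third way to keep one row per parent is a correlated subquery that picks a single child key. The sketch below runs through Python's sqlite3 module, with simplified, hypothetical versions of the blog tables. It assumes that "one instance" means keeping the tracking row with the lowest id for each entry, which the question does not actually specify.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE blog_entries (id INTEGER, title TEXT);
    CREATE TABLE blog_entry_tracking (id INTEGER, blog INTEGER, blog_entry INTEGER);
    CREATE TABLE blog_categories (id INTEGER, name TEXT);
    INSERT INTO blog_entries VALUES (1, 'A'), (2, 'B'), (3, 'C');
    -- Entry 1 is assigned to two categories; entry 3 to none.
    INSERT INTO blog_entry_tracking VALUES (1, 10, 1), (2, 11, 1), (3, 10, 2);
    INSERT INTO blog_categories VALUES (10, 'News'), (11, 'Tips');
""")

entries = conn.execute("""
    SELECT be.id, be.title, bc.name
    FROM blog_entries be
    -- Join only the tracking row with the lowest id for this entry.
    LEFT JOIN blog_entry_tracking bet
        ON bet.id = (SELECT MIN(t.id)
                     FROM blog_entry_tracking t
                     WHERE t.blog_entry = be.id)
    LEFT JOIN blog_categories bc ON bc.id = bet.blog
    ORDER BY be.id
""").fetchall()
print(entries)
```

Each entry comes back once, and entries without any category are still returned with a NULL category name.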

SQL Server view with a 'select where x is not null' takes ages to complete

I have a complex view, which is described here: View of multiple tables. Need to remove "doubles" defined by 1 table
I used a Cross Apply in it, and the code is this: (please do check the url above to understand the view)
SELECT dbo.InstellingGegevens.INST_SUBTYPE
, dbo.InstellingGegevens.INST_BRON
, dbo.InstellingGegevens.INST_INSTELLINGSNUMMER
, dbo.InstellingGegevens.INST_NAAM
, dbo.InstellingGegevens.INST_KORTENAAM
, dbo.InstellingGegevens.INST_VESTIGINGSNAAM
, dbo.InstellingGegevens.INST_ROEPNAAM
, dbo.InstellingGegevens.INST_STATUUT
, dbo.InstellingGegevens.ONDERWIJSNIVEAU_REF
, dbo.InstellingGegevens.ONDERWIJSSOORT_REF
, dbo.InstellingGegevens.DATUM_TOT
, dbo.InstellingGegevens.DATUM_VAN
, dbo.InstellingGegevens.VERBOND_REF
, dbo.InstellingGegevens.VSKO_LID
, dbo.InstellingGegevens.NET_REF
, dbo.Instellingen.Inst_ID
, dbo.Instellingen.INST_TYPE
, dbo.Instellingen.INST_REF
, dbo.Instellingen.INST_LOC_REF
, dbo.Instellingen.INST_LOCNR
, dbo.Instellingen.Opt_KalStandaard
, dbo.InstellingTelecom.INST_TEL
, dbo.InstellingTelecom.INST_FAX
, dbo.InstellingTelecom.INST_EMAIL
, dbo.InstellingTelecom.INST_WEB
, dbo.InstellingAdressen.SOORT
, dbo.InstellingAdressen.STRAAT
, dbo.InstellingAdressen.POSTCODE
, dbo.InstellingAdressen.GEMEENTE
, dbo.InstellingAdressen.GEM_REF
, dbo.InstellingAdressen.FUSIEGEM_REF
, dbo.InstellingAdressen.FUSIEGEM
, dbo.InstellingAdressen.ALFA_G
, dbo.InstellingAdressen.PROVINCIE
, dbo.InstellingAdressen.BISDOM
, dbo.InstellingAdressen.ARRONDISSEMENT
, dbo.InstellingAdressen.GEWEST
, dbo.InstellingContPersDirecteurs.AANSPREKING
, dbo.InstellingContPersDirecteurs.CONTACTPERSOON
, dbo.InstellingContPersDirecteurs.FUNCTIE
, InstellingLogin.Inst_Gebruikersnaam
, InstellingLogin.Inst_Concode
, InstellingLogin.Inst_DirCode
, InstellingLogin.DOSSNR
, InstellingLogin.Instelling_ID
FROM dbo.InstellingGegevens
RIGHT OUTER JOIN dbo.Instellingen
ON dbo.InstellingGegevens.INST_TYPE = dbo.Instellingen.INST_TYPE
AND dbo.InstellingGegevens.INST_REF = dbo.Instellingen.INST_REF
AND dbo.InstellingGegevens.INST_LOC_REF = dbo.Instellingen.INST_LOC_REF
AND dbo.InstellingGegevens.INST_LOCNR = dbo.Instellingen.INST_LOCNR
LEFT OUTER JOIN dbo.InstellingTelecom
ON dbo.InstellingGegevens.INST_TYPE = dbo.InstellingTelecom.INST_TYPE
AND dbo.InstellingGegevens.INST_REF = dbo.InstellingTelecom.INST_REF
AND dbo.InstellingGegevens.INST_LOC_REF = dbo.InstellingTelecom.INST_LOC_REF
LEFT OUTER JOIN dbo.InstellingAdressen
ON dbo.InstellingGegevens.INST_TYPE = dbo.InstellingAdressen.INST_TYPE
AND dbo.InstellingGegevens.INST_REF = dbo.InstellingAdressen.INST_REF
AND dbo.InstellingGegevens.INST_LOC_REF = dbo.InstellingAdressen.INST_LOC_REF
LEFT OUTER JOIN dbo.InstellingContPersDirecteurs
ON dbo.InstellingGegevens.INST_TYPE = dbo.InstellingContPersDirecteurs.INST_TYPE
AND dbo.InstellingGegevens.INST_REF = dbo.InstellingContPersDirecteurs.INST_REF
AND dbo.InstellingGegevens.INST_LOC_REF = dbo.InstellingContPersDirecteurs.INST_LOC_REF
CROSS APPLY
(SELECT TOP (1) *
FROM InstellingLogin AS il
WHERE Instellingen.INST_LOC_REF = il.Inst_Loc_REF
AND Instellingen.INST_LOCNR = il.Inst_Loc_Nr
AND Instellingen.INST_REF = il.Inst_InstellingIKON_REF
AND Instellingen.INST_TYPE = il.Inst_InstellingIKONType
ORDER BY CASE
WHEN il.datum_tot IS NULL
THEN 0 ELSE 1
END
, il.datum_tot DESC) InstellingLogin
This view returns me about 5.5k rows, in about 1s. This is fast!
However!
When I call this view with a where clause:
SELECT *
FROM [Tink].[dbo].[InstellingAlleDetails]
where gemeente is not null and (DATUM_TOT is null or DATUM_TOT > GETDATE())
order by GEMEENTE, POSTCODE,STRAAT, INST_NAAM
it takes 1min 20s to return all rows.
When I drop the gemeente is not null part, it takes again 1s.
Gemeente is a varchar(255). I also tried it with Inst_Naam is not null and that also took about 1min 30s.
Why does the is not null condition take so much time? And more importantly: how do I fix this?
I don't know why. Probably SQL Server comes up with a query plan that is not so good.
You could try to first run the query without gemeente is not null and put the result in a temp table and then query the temp table with gemeente is not null.
select *
into #TempTable
from YourView
select *
from #TempTable
where gemeente is not null
drop table #TempTable
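The same two-step shape, sketched end to end with Python's sqlite3 module. This does not reproduce the timing difference from the question, which depends on SQL Server's query plan. It only shows the mechanics of materializing the view first and filtering afterwards. The view, table, and rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE instelling (inst_naam TEXT, gemeente TEXT);
    INSERT INTO instelling VALUES ('School A', 'Gent'), ('School B', NULL), ('School C', 'Brugge');
    -- Stand-in for the complex view.
    CREATE VIEW all_details AS SELECT inst_naam, gemeente FROM instelling;
""")

# Step 1: materialize the unfiltered view into a temporary table.
conn.execute("CREATE TEMP TABLE temp_details AS SELECT * FROM all_details")

# Step 2: apply the filter to the materialized rows, not to the view.
filtered = conn.execute("""
    SELECT inst_naam, gemeente
    FROM temp_details
    WHERE gemeente IS NOT NULL
    ORDER BY gemeente
""").fetchall()

# Step 3: clean up.
conn.execute("DROP TABLE temp_details")
print(filtered)
```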
First, compare the execution plans of the query with and without the is not null condition and see how they differ.
BTW are any of these joins to other views? That can cause tremendous performance problems.