SQL:Merge two different query - sql

I am pretty new at SQL!
I have two querys from different tables. This is my first query:
SELECT cli.CLIENTE as clientet,cli.RAZON as nom,
SUM(distinct par.[RECAPTACIO]*0.9) AS recaudacionSINIVA,
SUM(distinct par.canvi) as cambiocargado,
SUM(distinct par.consum) as CC_recaptacio,count(distinct par.DATA) as visita,
SUM((c.quantitat + c.caixes * c.uxcaja) * c.pvp) as CC_Caducados
FROM [V30].[dbo].[parmaq] as par,[V30].[dbo].[clientes] as cli , V30.dbo.parcons as c
where cli.CLIENTE=par.EMPRESA and par.enllas=c.enllas
and par.data > '13-11-2015' and par.data < '20-11-2015'
group by cli.CLIENTE,cli.RAZON
And this is the result:
This is my second query:
SELECT cli.cliente as clientet, cli.RAZON, SUM( distinct m1.IMPORTE) as FACTURACION,
SUM(distinct m2.coste * m2.canti) as CC_facturacio
FROM [V30].[dbo].[clientes] as cli, [V30].[dbo].[MESTA1] as m1,V30.dbo.MESTA2 as m2
where m2.albaran=m1.albaran and m1.CLIENTE=cli.CLIENTE
and m1.FECHA > '13-11-2015' and m1.FECHA < '30-11-2015'
group by cli.CLIENTE,cli.razon order by cli.cliente
And the result of the second query:
On the sql there is no way of linking the tables with primary key! and the results from the querys one and two are fine. What I want to do is combining the results using the column "clientet". I have to say that you can get different numbers of rows.
The goal is to have all this information on the same result. For example, if "clientet" have results on both querys, the final result should be all the columns from the first query + "FACTURACION" and "CC_facturacio" from the second query.
Hope you can help me, I have been trying Inner Joins and sub-querys but there is no way to get this.

Two options:
As jarlh suggested, you could do a union as long as you put place holders in each query for the missing columns that don't exist and make sure they are in the same order.
You could do a full outer join of your two queries. You may want to add coalesce to each other column, depending on whether you want nulls to appear for columns when a row doesn't exist in both tables.
The example below assumes clientet is the joining column:
SELECT COALESCE(TBL_1.clientet, TBL_2.clientet) as clientet
, TBL_1.nom
, TBL_1.recaudacionSINIVA
, TBL_1.cambiocargado
, TBL_1.CC_recaptacio
, TBL_1.visita
, TBL_1.CC_Caducados
, TBL_2.FACTURACION
, TBL_2.CC_facturacio
FROM
(SELECT cli.CLIENTE as clientet
, cli.RAZON as nom
, SUM(distinct par.[RECAPTACIO]*0.9) AS recaudacionSINIVA
, SUM(distinct par.canvi) as cambiocargado
, SUM(distinct par.consum) as CC_recaptacio
, count(distinct par.DATA) as visita
, SUM((c.quantitat + c.caixes * c.uxcaja) * c.pvp) as CC_Caducados
FROM [V30].[dbo].[parmaq] as par,[V30].[dbo].[clientes] as cli , V30.dbo.parcons as c
where cli.CLIENTE=par.EMPRESA and par.enllas=c.enllas
and par.data > '13-11-2015' and par.data < '20-11-2015'
group by cli.CLIENTE,cli.RAZON) TBL_1
FULL OUTER JOIN
(SELECT cli.cliente as clientet
, cli.RAZON
, SUM( distinct m1.IMPORTE) as FACTURACION
, SUM(distinct m2.coste * m2.canti) as CC_facturacio
FROM [V30].[dbo].[clientes] as cli, [V30].[dbo].[MESTA1] as m1,V30.dbo.MESTA2 as m2
where m2.albaran=m1.albaran and m1.CLIENTE=cli.CLIENTE
and m1.FECHA > '13-11-2015' and m1.FECHA < '30-11-2015'
group by cli.CLIENTE,cli.razon) TBL_2
ON TBL_1.clientet = TBL_2.clientet
ORDER BY COALESCE(TBL_1.clientet, TBL_2.clientet)

Related

SQL CASE WHEN ELSE not working in AWS Athena

I have the script below setup in AWS Athena, the goal is to replace some budget numbers (total) with 0 if they are within a certain category (costitemid). I'm getting the following error in AWS Athena and could use some advice as to why it isn't working. Is the problem that I need to repeat everything in the FROM and GROUP BY in the WHEN and ELSE? Code below the error. Thank you!
SYNTAX_ERROR: line 6:9: 'projectbudgets.projectid' must be an aggregate expression or appear in GROUP BY clause
This query ran against the "acorn-prod-reports" database, unless qualified by the query. Please post the error message on our forum or contact customer support with Query Id: 077f007b-61a0-4f6b-aa1f-dd38bb401218
SELECT
CASE
WHEN projectbudgetlineitems.costitemid IN (462561,462562,462563,462564,462565,462566,478030) THEN (
SELECT
projectbudgets.projectid
, projectbudgetyears.year fiscalYear
, projectbudgetyears.status
, "sum"(((0 * projectbudgetlineitems.unitcost) * (projectbudgetlineitems.costshare * 1E-2))) total
)
ELSE (
SELECT
projectbudgets.projectid
, projectbudgetyears.year fiscalYear
, projectbudgetyears.status
, "sum"(((projectbudgetlineitems.quantity * projectbudgetlineitems.unitcost) * (projectbudgetlineitems.costshare * 1E-2))) total
)
END
FROM
(("acorn-prod-etl".target_acorn_prod_acorn_projectbudgets projectbudgets
INNER JOIN "acorn-prod-etl".target_acorn_prod_acorn_projectbudgetyears projectbudgetyears ON (projectbudgets.id = projectbudgetyears.projectbudgetid))
INNER JOIN "acorn-prod-etl".target_acorn_prod_acorn_projectbudgetlineitems projectbudgetlineitems ON (projectbudgetyears.id = projectbudgetlineitems.projectbudgetyearid))
--WHERE (((projectbudgetlineitems.costitemid <> 478030) AND (projectbudgetlineitems.costitemid < 462561)) OR (projectbudgetlineitems.costitemid > 462566))
GROUP BY projectbudgets.projectid, projectbudgetyears.year, projectbudgetyears.status
Your syntax is wrong (at least according to most SQL dialects.) You can't generally say "SELECT CASE WHEN (condition) THEN (this select clause) ELSE (that select clause) END FROM (tables)"
You can only use CASE to calculate a single value.
But it looks as if the only change between your two inner SELECT clauses is whether you use 0 or the quantity in the final multiplication. And that is perfect for a CASE!
I do not guarantee this will work right off the bat, because I don't have your setup or an idea of your table layout. However, it's a step in the right direction:
SELECT
projectbudgets.projectid
, projectbudgetyears.year fiscalYear
, projectbudgetyears.status
, "sum"(
((
CASE
WHEN projectbudgetlineitems.costitemid IN (462561,462562,462563,462564,462565,462566,478030)
THEN 0
ELSE projectbudgetlineitems.quantity
END * projectbudgetlineitems.unitcost
) * (
projectbudgetlineitems.costshare * 1E-2
))) total
FROM
(("acorn-prod-etl".target_acorn_prod_acorn_projectbudgets projectbudgets
INNER JOIN
"acorn-prod-etl".target_acorn_prod_acorn_projectbudgetyears projectbudgetyears
ON (projectbudgets.id = projectbudgetyears.projectbudgetid))
INNER JOIN "acorn-prod-etl".target_acorn_prod_acorn_projectbudgetlineitems projectbudgetlineitems
ON (projectbudgetyears.id = projectbudgetlineitems.projectbudgetyearid))
GROUP BY
projectbudgets.projectid, projectbudgetyears.year, projectbudgetyears.status
This could solve your problem if you want to sum the items for each project and year and status except for certain line items. Here, it is correct to use a "where" condition and not "case when" :
SELECT
projectbudgets.projectid,
projectbudgetyears.year,
projectbudgetyears.status,
"sum"(((projectbudgetlineitems.quantity * projectbudgetlineitems.unitcost) *
(projectbudgetlineitems.costshare * 1E-2))) total
FROM
(("acorn-prod-etl".target_acorn_prod_acorn_projectbudgets projectbudgets
INNER JOIN "acorn-prod-etl".target_acorn_prod_acorn_projectbudgetyears
projectbudgetyears ON (projectbudgets.id = projectbudgetyears.projectbudgetid))
INNER JOIN "acorn-prod-etl".target_acorn_prod_acorn_projectbudgetlineitems
projectbudgetlineitems ON (projectbudgetyears.id =
projectbudgetlineitems.projectbudgetyearid))
WHERE projectbudgetlineitems.costitemid NOT IN
(462561,462562,462563,462564,462565,462566,478030)
GROUP BY projectbudgets.projectid, projectbudgetyears.year,
projectbudgetyears.status
;

Column ambigously defined in subquery with join on another subquery

I keep getting a column ambiguously defined error when joining two sub queries. However I have defined all my columns properly. I want to get all the data from the first query and add some data to it where available. How can this be fixed?
SELECT
sq2.month,
sq1.PRIMARY_MER_NUM ,
sq1.PRIMARY_EXT_MID ,
sq1.MER_DBA_NAM,
sq1.CLG_NUM,
sq1.ENT_NUM,
sq1.ENT_NAM,
sq1.MER_OPN_DTE,
sq1.MER_CLS_DTE,
sq1.MER_FST_DPST_DTE,
sq1.CLG_NUM ,
sq1.ENT_NUM,
sq2.gross_volume,
sq2.transaction_count
FROM
(SELECT DISTINCT
PRIMARY_MER_NUM ,
PRIMARY_EXT_MID ,
MER_DBA_NAM,
CLG_NUM,
ENT_NUM,
ENT_NAM,
MER_OPN_DTE,
MER_CLS_DTE,
MER_FST_DPST_DTE,
CLG_NUM ,
ENT_NUM
FROM
bi.t_mer_dim_na
WHERE
CLG_NUM = 7
AND ENT_NUM IN ('45810', '45811', '46849', '45948', '45824',
'46911', '45509', '46845', '48902')
) sq1
LEFT JOIN
(SELECT
TRUNC(BAT_REF_DTE, 'MM') AS month,
MER_NUM,
SUM(bat_prd_trn_dr_amt + bat_prd_trn_cr_amt) AS gross_volume,
SUM(bat_item_num) AS transaction_count
FROM
TDS.BAT_T3
WHERE
1 = 1
AND bat_ref_dte >= TRUNC(sysdate, 'MM')
GROUP BY
TRUNC(BAT_REF_DTE, 'MM'), MER_NUM) SQ2 ON sq1.primary_mer_num = sq2.MER_NUM;
You have CLG_NUM and ENT_NUM selected twice in your first derived table SQL1
FROM (
select DISTINCT
PRIMARY_MER_NUM ,
PRIMARY_EXT_MID ,
MER_DBA_NAM,
CLG_NUM, --1
ENT_NUM, --1
ENT_NAM,
MER_OPN_DTE,
MER_CLS_DTE,
MER_FST_DPST_DTE,
CLG_NUM, --2
ENT_NUM --2
from bi.t_mer_dim_na
That makes selecting sql1.CLG_NUM and sql1.ENT_NUM ambiguous in your outer select (where you also select both twice)

Use of MAX function in SQL query to filter data

The code below joins two tables and I need to extract only the latest date per account, though it holds multiple accounts and history records. I wanted to use the MAX function, but not sure how to incorporate it for this case. I am using My SQL server.
Appreciate any help !
select
PROP.FileName,PROP.InsName, PROP.Status,
PROP.FileTime, PROP.SubmissionNo, PROP.PolNo,
PROP.EffDate,PROP.ExpDate, PROP.Region,
PROP.Underwriter, PROP_DATA.Data , PROP_DATA.Label
from
Property.dbo.PROP
inner join
Property.dbo.PROP_DATA on Property.dbo.PROP.FileID = Actuarial.dbo.PROP_DATA.FileID
where
(PROP_DATA.Label in ('Occupancy' , 'OccupancyTIV'))
and (PROP.EffDate >= '42278' and PROP.EffDate <= '42643')
and (PROP.Status = 'Bound')
and (Prop.FileTime = Max(Prop.FileTime))
order by
PROP.EffDate DESC
Assuming your DBMS supports windowing functions and the with clause, a max windowing function would work:
with all_data as (
select
PROP.FileName,PROP.InsName, PROP.Status,
PROP.FileTime, PROP.SubmissionNo, PROP.PolNo,
PROP.EffDate,PROP.ExpDate, PROP.Region,
PROP.Underwriter, PROP_DATA.Data , PROP_DATA.Label,
max (PROP.EffDate) over (partition by PROP.PolNo) as max_date
from Actuarial.dbo.PROP
inner join Actuarial.dbo.PROP_DATA
on Actuarial.dbo.PROP.FileID = Actuarial.dbo.PROP_DATA.FileID
where (PROP_DATA.Label in ('Occupancy' , 'OccupancyTIV'))
and (PROP.EffDate >= '42278' and PROP.EffDate <= '42643')
and (PROP.Status = 'Bound')
and (Prop.FileTime = Max(Prop.FileTime))
)
select
FileName, InsName, Status, FileTime, SubmissionNo,
PolNo, EffDate, ExpDate, Region, UnderWriter, Data, Label
from all_data
where EffDate = max_date
ORDER BY EffDate DESC
This also presupposes than any given account would not have two records on the same EffDate. If that's the case, and there is no other objective means to determine the latest account, you could also use row_numer to pick a somewhat arbitrary record in the case of a tie.
Using straight SQL, you can use a self-join in a subquery in your where clause to eliminate values smaller than the max, or smaller than the top n largest, and so on. Just set the number in <= 1 to the number of top values you want per group.
Something like the following might do the trick, for example:
select
p.FileName
, p.InsName
, p.Status
, p.FileTime
, p.SubmissionNo
, p.PolNo
, p.EffDate
, p.ExpDate
, p.Region
, p.Underwriter
, pd.Data
, pd.Label
from Actuarial.dbo.PROP p
inner join Actuarial.dbo.PROP_DATA pd
on p.FileID = pd.FileID
where (
select count(*)
from Actuarial.dbo.PROP p2
where p2.FileID = p.FileID
and p2.EffDate <= p.EffDate
) <= 1
and (
pd.Label in ('Occupancy' , 'OccupancyTIV')
and p.Status = 'Bound'
)
ORDER BY p.EffDate DESC
Have a look at this stackoverflow question for a full working example.
Not tested
with temp1 as
(
select foo
from bar
whre xy = MAX(xy)
)
select PROP.FileName,PROP.InsName, PROP.Status,
PROP.FileTime, PROP.SubmissionNo, PROP.PolNo,
PROP.EffDate,PROP.ExpDate, PROP.Region,
PROP.Underwriter, PROP_DATA.Data , PROP_DATA.Label
from Actuarial.dbo.PROP
inner join temp1 t
on Actuarial.dbo.PROP.FileID = t.dbo.PROP_DATA.FileID
ORDER BY PROP.EffDate DESC

Adding argument to SELECT statement with UNION changes record number

When I add a new argument to the SELECT clause of a UNION, I get more records... how can this be? Isn't a UNION just mashing them together? Example:
EDIT: They're absolutely distinct. the code column is either "IN" or "OUT", and that's what I'm using to separate the two.
EDIT2: UNION ALL gives me 80 records, like it should, but it's odd because my two SELECT statements are absolutely distinct.
FINAL EDIT: Ultimate problem was records within one of my SELECT statements being not DISTINCT, not between the two SELECT statements. Thanks all.
-- Yields 76 records
SELECT
f.date
, f.code
, f.cost
FROM a.fact f
WHERE f.code = 'IN'
UNION
SELECT
f2.date
, f2.code
, f2.cost
FROM a.fact2 f2
WHERE f2.code = 'OUT'
;
-- Yields 80 records
SELECT
f.key
, f.date
, f.code
, f.cost
FROM a.fact f
WHERE f.code = 'IN'
UNION
SELECT
f2.key
, f2.date
, f2.code
, f2.cost
FROM a.fact2 f2
WHERE f2.code = 'OUT'
;
change UNION to UNION ALL and you should get the same results. UNION selects distinct rows, UNION ALL should select all.
By default UNION selects distinct results, there must be duplicates between your result sets.
The union all will solve your problem, as mentioned in a previous answer, it selects the distinct values only
References Oracle docs.

Why I need Group by in this simple query?

UPDATE :
-----
the error might be in sum(si.amt_pd) from item table (as there is no relation) :
select SUM(si.amt_pd)amt_pd from [HMIS_REPORTING].HMIS_RPT_ME.dbo.item i
where
is there a work around?
----------
I am trying to run this query. The query just fetches the amount of a month based on some tables. It is just a part of a big query.
select s.sales_Contract_Nbr
, s.Sales_Id
, s.Sale_Dt
, YEAR(s.Sale_Dt) 'YEAR'
, MONTH(s.Sale_Dt) 'MONTH'
, s.Sales_Need_TYpe_Cd
, s.Sales_Status_Cd
, si.Posted
, s.location_Cd
, jan2011 = (
select SUM(si.amt_pd)amt_pd
from [HMIS_REPORTING].HMIS_RPT_ME.dbo.item i
where i.Item_Id = si.Product_Item_ID
and i.Item_Cd <> '*INT'
and convert(varchar(10),SI.Sales_Item_Dt,126) >= '2011-01-01'
and convert(varchar(10),SI.Sales_Item_Dt,126) >= '2011-01-31'
) INTO dbo.#a_acomparision
FROM [HMIS_REPORTING].HMIS_RPT_ME.dbo.Sales S
, [HMIS_REPORTING].HMIS_RPT_ME.dbo.Sales_Item SI
WHERE SI.Sales_Id = S.Sales_Id
and s.Sales_Contract_Nbr in (
select distinct (Sales_Contract_Nbr)
from mountainviewContracts
where Sales_Contract_Nbr <> '')
but I am getting the following error message.
Msg 8120, Level 16, State 1, Line 1
Column 'HMIS_REPORTING.HMIS_RPT_ME.dbo.Sales.Sales_Contract_Nbr' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
I just can't understand why my query should have a group by for sales_contract_nbr and even if I put in the group by clause it tells me that inner query si.Product_item_id and SI.sales_item_dt should also be contained in group by clause.
Please help me out.
Thanks in advance
This is a very subtle problem. However, I think the subquery should be:
select SUM(i.amt_pd)amt_pd from [HMIS_REPORTING].HMIS_RPT_ME.dbo.item i
That is, the alias should be i not si.
What is happening is that the sum in the subquery is on a value in the outer query. So, the SQL compiler assumes an aggregation query. As soon as the first column is found that is not an aggregation, it complains with the message that you have.
By the way, you should use proper join syntax, so you from clause looks like:
FROM [HMIS_REPORTING].HMIS_RPT_ME.dbo.Sales S join
[HMIS_REPORTING].HMIS_RPT_ME.dbo.Sales_Item SI
on SI.Sales_Id = S.Sales_Id
As #Gordon Linoff says, this is almost certainly because the query optimizer is treating this like a SUM operation, normalizing away the subquery for "jan2001".
If the amt_pd column is present in the ITEM table, Gordon's solution is the right one.
If not, you have to add the group by statement, as below.
select s.sales_Contract_Nbr
, s.Sales_Id
, s.Sale_Dt
, YEAR(s.Sale_Dt) 'YEAR'
, MONTH(s.Sale_Dt) 'MONTH'
, s.Sales_Need_TYpe_Cd
, s.Sales_Status_Cd
, si.Posted
, s.location_Cd
, jan2011 = (
select SUM(si.amt_pd)amt_pd
from [HMIS_REPORTING].HMIS_RPT_ME.dbo.item i
where i.Item_Id = si.Product_Item_ID
and i.Item_Cd <> '*INT'
and convert(varchar(10),SI.Sales_Item_Dt,126) >= '2011-01-01'
and convert(varchar(10),SI.Sales_Item_Dt,126) >= '2011-01-31'
) INTO dbo.#a_acomparision
FROM [HMIS_REPORTING].HMIS_RPT_ME.dbo.Sales S
, [HMIS_REPORTING].HMIS_RPT_ME.dbo.Sales_Item SI
WHERE SI.Sales_Id = S.Sales_Id
and s.Sales_Contract_Nbr in (
select distinct (Sales_Contract_Nbr)
from mountainviewContracts
where Sales_Contract_Nbr <> '')
GROUP BY s.sales_Contract_Nbr
, s.Sales_Id
, s.Sale_Dt
, YEAR
, MONTH
, s.Sales_Need_TYpe_Cd
, s.Sales_Status_Cd
, si.Posted
, s.location_Cd