How to split a column into two columns based on the value in another column - sql

I have the data below in a Microsoft SQL Server table.
I would like to get output like the one below.
I have tried two queries, but neither helped.
The first query gives me NULL values.
Query
SELECT
[id]
, [sav]
, [cat]
, [tech]
, [asset]
, CASE
WHEN [objname] = 'FieldName'
THEN [stringvalue]
END AS [fieldname]
, CASE
WHEN [objname] = 'FieldValue'
THEN [stringvalue]
END AS [fieldvalue]
FROM [test].[dbo].[sample];
Output
The second query gives me 0 as the field value, because I have hard-coded it.
Query
SELECT
ROW_NUMBER() OVER(ORDER BY [fieldname]) AS 'id'
, [sav]
, [cat]
, [tech]
, [asset]
, [fieldname]
, 0 AS [fieldvalue]
FROM [test].[dbo].[sample] PIVOT(MAX([stringvalue]) FOR [objname] IN(
[fieldname])) [p]
WHERE [fieldname] IS NOT NULL;
Output
How can I achieve this?

You have a very arcane data structure. SQL tables are inherently unordered. From what I can tell, the field value is in the "next" row based on the id.
If so, you can use lead():
select . . .,
       stringvalue as fieldname, next_string_value as fieldvalue
from (select t.*, lead(t.stringvalue) over (order by id) as next_string_value
      from t
     ) t
where t.objname = 'FieldName';
lead() requires SQL Server 2012 or later. If you are really using SQL Server 2008, you can use a self-join instead. This does assume that the ids have no gaps in them.
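The self-join mentioned above can be sketched as follows. This is only an illustration: it runs on an in-memory SQLite database through Python's sqlite3 module so that it is self-contained, the table layout is reduced to three columns, and the sample rows are hypothetical rather than taken from the question. The join condition itself is plain SQL and should carry over to SQL Server.

```python
import sqlite3


def pair_rows_by_self_join(rows):
    """Pair each 'FieldName' row with the 'FieldValue' row whose id is one higher."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sample (id INTEGER PRIMARY KEY, objname TEXT, stringvalue TEXT)"
    )
    conn.executemany("INSERT INTO sample VALUES (?, ?, ?)", rows)
    query = """
        SELECT n.id, n.stringvalue AS fieldname, v.stringvalue AS fieldvalue
        FROM sample AS n
        JOIN sample AS v
          ON v.id = n.id + 1              -- assumes consecutive ids with no gaps
         AND v.objname = 'FieldValue'
        WHERE n.objname = 'FieldName'
        ORDER BY n.id
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


# Hypothetical sample data, not taken from the question.
rows = [
    (1, "FieldName", "Colour"),
    (2, "FieldValue", "Red"),
    (3, "FieldName", "Size"),
    (4, "FieldValue", "Large"),
]
print(pair_rows_by_self_join(rows))
# [(1, 'Colour', 'Red'), (3, 'Size', 'Large')]
```

If the ids can have gaps, the `n.id + 1` condition silently drops pairs, which is why the no-gaps assumption matters.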

Related

Can you explain the meaning of a minus in a SQL select statement?

I am working with SQL and I found this snippet. My question is: what do the minus symbols (-) inside the SELECT statement mean? I know it is some kind of trick, but I can't find information online about how it is used. Any insight would be welcome.
I am referring to:
SELECT - sum(st.sales) AS sales
, - sum(st.orders) AS orders
, - sum(st.aov) AS aov
It seems to be related to ledger tables. If you have any documentation, blog, or PDF, please give me the link so I can check it.
The full SQL looks like this:
INSERT INTO sales_test
WITH source_query AS --find the existing values in the ledger table and invert them
(
SELECT
st.og_date
, st.merchant
, st.store_name
, st.country
, st.kam
, st.class
, st.origin
, - sum(st.sales) AS sales
, - sum(st.orders) AS orders
, - sum(st.aov) AS aov
, et.source_file_name
, et.source_file_timestamp
FROM
sales_test st
INNER JOIN
ext_sales_test et
ON
city_hash(et.og_date, et.merchant, et.store_name, et.country, et.kam, et.class, et.origin) = city_hash(st.og_date, st.merchant, st.store_name, st.country, st.kam, st.class, st.origin)
AND st.og_date = et.og_date
AND st.merchant = et.merchant
GROUP BY
st.og_date
, st.merchant
, st.store_name
, st.country
, st.kam
, st.class
, st.origin
, et.source_file_name
, et.source_file_timestamp
)
, union_query AS --if we union the incoming data with the inverted existing data, we get the difference that needs to be ledgered
(
SELECT *
FROM
source_query
UNION ALL
SELECT *
FROM
ext_sales_test
)
It makes the numeric value negative (and if the value is already negative, negating it makes it positive). In your case, it first performs the sum and then negates the result.
As an example:
USE tempdb;
GO
DECLARE @Num1 INT;
SET @Num1 = 5;
SELECT @Num1 AS VariableValue, -@Num1 AS NegativeValue;
GO
Result set:
VariableValue NegativeValue
------------- -------------
5 -5
(1 row(s) affected)
Further info here
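To see why the question's query negates the sums, here is a minimal sketch of the ledger idea described in its comments: invert what is already stored, union it with the incoming data, and what remains is the difference. It uses SQLite through Python's sqlite3 module so it can be run as-is. The two-column tables, the `ledger_delta` function, and the sample rows are hypothetical. The question's query is cut off before its last step, so the closing aggregation here is an assumption about how the union would be consumed.

```python
import sqlite3


def ledger_delta(existing, incoming):
    """Return (merchant, delta) rows: incoming totals minus what is already ledgered."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE ledger (merchant TEXT, sales INTEGER)")
    conn.execute("CREATE TABLE incoming (merchant TEXT, sales INTEGER)")
    conn.executemany("INSERT INTO ledger VALUES (?, ?)", existing)
    conn.executemany("INSERT INTO incoming VALUES (?, ?)", incoming)
    query = """
        WITH inverted AS (
            SELECT merchant, -SUM(sales) AS sales   -- unary minus applied after SUM
            FROM ledger
            GROUP BY merchant
        ),
        combined AS (
            SELECT merchant, sales FROM inverted
            UNION ALL
            SELECT merchant, sales FROM incoming
        )
        SELECT merchant, SUM(sales) AS delta        -- assumed final step
        FROM combined
        GROUP BY merchant
        ORDER BY merchant
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


# Hypothetical data: 'acme' already has 120 ledgered and 150 arrives.
existing = [("acme", 100), ("acme", 20), ("bolt", 50)]
incoming = [("acme", 150), ("bolt", 50)]
print(ledger_delta(existing, incoming))
# [('acme', 30), ('bolt', 0)]
```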

Formatting the code from IT to match Power Query / Power BI

Could someone please confirm what language this code is, and how I should edit it so it works with the Power BI function for importing data from a SQL database?
I got this from an IT person, but when I try to paste it into Power Query it gives error messages.
SELECT distinct Z.Territory , (z.AccountNumber ) as AccountNumber , (a.AccountType ) as AccountType , (z.CompanyName ) as CompanyName , (z.AccountNumber ) as AccountNumber , (z.CompanyName ) as CompanyName , z.SubscriptionReference ,z.SubscriptionID , (isnull(z.serialnumber,r.SerialNumber) ) as SerialNumber ,case when (rpc.billingperiodalignment) = 'AlignToCharge' then (z.product) else isnull(r.ProductDescription,z.product) end as Description , (r.ProductVersion ) as ProductVersion , (r.CoverExpiryDate ) as CoverExpiryDate , (z.SubscriptionStatus ) as SubscriptionStatus ,z.SubscriptionVersion
--, MAX(z.RenewalTerm) AS RenewalTerm , (z.[SubscriptionTermType]) as [SubscriptionTermType] , (case when z.[UnitofMeasure] = 'Desktop Users' then z.[Quantity] end) as 'Desktop Users' , (z.billingperiod ) as BillingPeriod ,(rpc.BillingPeriodAlignment) BillingPeriodAlignment ,(z.[ContractEffectiveDate]) as ContractEffectiveDate ,(z.TermEndDate) as TermEndDate
It's SQL, but it's invalid, so you'll need to get it fixed. It looks like it got cut off. When it's valid you should be able to paste it into a SQL Server Management Studio query window and test it.
To use it in Power Query, use Value.NativeQuery, like this:
let
Source = Sql.Database("localhost", "adventureworks2017"),
Query = "
select *
from Sales.SalesOrderHeader
",
Data = Value.NativeQuery(Source, Query, null, [EnableFolding=true])
in
Data

Combine multiple boolean columns into a single column

I am generating reports from an ERP system where users are provided with check boxes, each of which returns a boolean value for the item selected. The database is hosted on SQL Server.
However, users can select Contracts with other values as well, as shown below.
I would like to capture the Categories as a single column and I don't mind having duplicate rows in the view. I would like the first row to return Contract and the second the other value selected, for the same Reference ID.
You can use apply:
select distinct t.*, tt.category
from t cross apply
( values ('Contracts', t.Contracts),
('Tender', t.Tender),
('Waiver', t.Waiver),
('Quotation', t.Quotation)
) tt(category, flag)
where flag = 1;
I guess a straightforward way is:
select *, 'Contract' as [Category] from [TableOne] where [Contract] = 1
union all select *, 'Tender' as [Category] from [TableOne] where [Tender] = 1
union all select *, 'Waiver' as [Category] from [TableOne] where [Waiver] = 1
union all select *, 'Quotation' as [Category] from [TableOne] where [Quotation] = 1
union all select *, '(none)' as [Category] from [TableOne] where [Contract]+[Tender]+[Waiver]+[Quotation] = 0
order by [Reference ID]
Note that the last line is put there just in case you need to handle the all-zero case.
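A runnable sketch of the UNION ALL approach, for anyone who wants to try it without a SQL Server instance. It uses SQLite through Python's sqlite3 module. The `requests` table, the `ref_id` column, the `sort_key` column, and the sample rows are all hypothetical. The `sort_key` is an addition not present in the answers above: it makes Contract come first within each reference, as the question asks.

```python
import sqlite3


def categories_per_reference(rows):
    """Turn one row of 0/1 flags into one (ref_id, category) row per set flag."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE requests ("
        "ref_id INTEGER, contract INTEGER, tender INTEGER, "
        "waiver INTEGER, quotation INTEGER)"
    )
    conn.executemany("INSERT INTO requests VALUES (?, ?, ?, ?, ?)", rows)
    query = """
        SELECT ref_id, 'Contract' AS category, 1 AS sort_key
        FROM requests WHERE contract = 1
        UNION ALL SELECT ref_id, 'Tender', 2 FROM requests WHERE tender = 1
        UNION ALL SELECT ref_id, 'Waiver', 3 FROM requests WHERE waiver = 1
        UNION ALL SELECT ref_id, 'Quotation', 4 FROM requests WHERE quotation = 1
        UNION ALL SELECT ref_id, '(none)', 5 FROM requests
                  WHERE contract + tender + waiver + quotation = 0
        ORDER BY ref_id, sort_key
    """
    result = [(ref_id, category) for ref_id, category, _ in conn.execute(query)]
    conn.close()
    return result


# Hypothetical rows: (ref_id, contract, tender, waiver, quotation)
rows = [
    (101, 1, 1, 0, 0),
    (102, 0, 0, 0, 1),
    (103, 0, 0, 0, 0),
]
print(categories_per_reference(rows))
# [(101, 'Contract'), (101, 'Tender'), (102, 'Quotation'), (103, '(none)')]
```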

Using Order By with Distinct on a Join (PLSQL)

I have written a join on some tables and I have ordered the data using two levels of ordering - one of which is the primary key of one table.
Now, with this data sorted I want to then exclude any duplicates from my data using an in-line view and the DISTINCT clause - and this is where I am coming unstuck.
I seem to be able to either sort the data OR distinct it, but never both at the same time. Is there a way around this or have I stumbled upon the SQL equivalent of the uncertainty principle?
This code returns the data sorted, but with duplicates
SELECT
ada.source_tab source_tab
, ada.source_col source_col
, ada.source_value source_value
, ada.ada_id ada_id
FROM
are_aud_data ada
, are_aud_exec_checks aec
, are_audit_elements ael
WHERE
aec.aec_id = ada.aec_id
AND ael.ano_id = aec.ano_id
AND aec.acn_id = 123456
AND ael.ael_type = 1
ORDER BY
CASE
WHEN source_tab = 'Tab type 1' THEN 1
WHEN source_tab = 'Tab type 2' THEN 2
ELSE 3
END
,ada.ada_id ASC;
This code removes the duplicates, but I lose the order...
SELECT DISTINCT source_tab, source_col, source_value FROM (
SELECT
ada.source_tab
, ada.source_col source_col
, ada.source_value source_value
, ada.ada_id ada_id
FROM
are_aud_data ada
, are_aud_exec_checks aec
, are_audit_elements ael
WHERE
aec.aec_id = ada.aec_id
AND ael.ano_id = aec.ano_id
AND aec.acn_id = 123456
AND ael.ael_type = 1
ORDER BY
CASE
WHEN source_tab = 'Tab type 1' THEN 1
WHEN source_tab = 'Tab type 2' THEN 2
ELSE 3
END
,ada.ada_id ASC
)
;
If I try and include 'ORDER BY ada_id' at the end of the outer select, I get the error message 'ORA-01791: not a SELECTed expression' which is infuriating me!!
Why don't you include ada_id at the selected fields of the outer query?
WITH CTE AS
(
SELECT
ada.source_tab source_tab
, ada.source_col source_col
, ada.source_value source_value
, ada.ada_id ada_id
, ROW_NUMBER() OVER (PARTITION BY [COLUMNS_YOU_WANT TO BE DISTINCT]
ORDER BY [your_columns]) rn
FROM
are_aud_data ada
, are_aud_exec_checks aec
, are_audit_elements ael
WHERE
aec.aec_id = ada.aec_id
AND ael.ano_id = aec.ano_id
AND aec.acn_id = 356441
AND ael.ael_type = 1
ORDER BY
CASE
WHEN source_tab = 'Licensed Inventory' THEN 1
WHEN source_tab = 'CMDB' THEN 2
ELSE 3
END
,ada.ada_id ASC
)
select * from CTE WHERE rn<2
It seems that ada_id is meaningless in the outer query.
You have removed all those values to boil the result down to the distinct source_tab and source_col...
What would you expect the order to be?
Maybe you want the minimum ada_id for each table and column set to drive the order (although the table name seems appropriate to me).
Include the minimum ada_id in the inner query (you'll need a GROUP BY clause),
then reference that in the outer query and sort on it.
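The GROUP BY suggestion above might look like the following sketch. It is run against SQLite through Python's sqlite3 module so it is self-contained. The three-table join from the question is collapsed into one hypothetical `audit_rows` table, and the sample rows are made up. Because the outer query no longer uses DISTINCT, it can sort on `first_ada_id` without selecting it, which sidesteps the ORA-01791 restriction.

```python
import sqlite3


def distinct_in_first_seen_order(rows):
    """De-duplicate on (source_tab, source_col, source_value), sorted by lowest ada_id."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE audit_rows ("
        "ada_id INTEGER, source_tab TEXT, source_col TEXT, source_value TEXT)"
    )
    conn.executemany("INSERT INTO audit_rows VALUES (?, ?, ?, ?)", rows)
    query = """
        SELECT source_tab, source_col, source_value
        FROM (
            SELECT source_tab, source_col, source_value,
                   MIN(ada_id) AS first_ada_id      -- one representative id per group
            FROM audit_rows
            GROUP BY source_tab, source_col, source_value
        )
        ORDER BY
            CASE source_tab
                WHEN 'Tab type 1' THEN 1
                WHEN 'Tab type 2' THEN 2
                ELSE 3
            END,
            first_ada_id
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


# Hypothetical rows: (ada_id, source_tab, source_col, source_value)
rows = [
    (5, "Tab type 2", "c", "x"),
    (1, "Tab type 1", "a", "v1"),
    (3, "Tab type 1", "a", "v1"),   # duplicate of ada_id 1
    (2, "Tab type 1", "b", "v2"),
    (4, "Other", "z", "q"),
]
print(distinct_in_first_seen_order(rows))
# [('Tab type 1', 'a', 'v1'), ('Tab type 1', 'b', 'v2'),
#  ('Tab type 2', 'c', 'x'), ('Other', 'z', 'q')]
```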

speed up SQL Query

I have a query which is taking some serious time to execute on anything older than, say, the past hour's worth of data. This is going to create a view which will be used for data mining, so the expectation is that it can search back weeks or months of data and return in a reasonable amount of time (even a couple of minutes is fine... I ran it for a date range of 10/3/2011 12:00pm to 10/3/2011 1:00pm and it took 44 minutes!)
The problem is with the two LEFT OUTER JOINs in the bottom. When I take those out, it can run in about 10 seconds. However, those are the bread and butter of this query.
This is all coming from one table. The ONLY thing this query returns differently than the original table is the column xweb_range. xweb_range is a calculated field column (range) which will only use the values from [LO,LC,RO,RC]_Avg where their corresponding [LO,LC,RO,RC]_Sensor_Alarm = 0 (do not include in range calculation if sensor alarm = 1)
WITH Alarm (sub_id,
LO_Avg, LO_Sensor_Alarm, LC_Avg, LC_Sensor_Alarm, RO_Avg, RO_Sensor_Alarm, RC_Avg, RC_Sensor_Alarm) AS (
SELECT sub_id, LO_Avg, LO_Sensor_Alarm, LC_Avg, LC_Sensor_Alarm, RO_Avg, RO_Sensor_Alarm, RC_Avg, RC_Sensor_Alarm
FROM dbo.some_table
where sub_id <> '0'
)
, AddRowNumbers AS (
SELECT rowNumber = ROW_NUMBER() OVER (ORDER BY LO_Avg)
, sub_id
, LO_Avg, LO_Sensor_Alarm
, LC_Avg, LC_Sensor_Alarm
, RO_Avg, RO_Sensor_Alarm
, RC_Avg, RC_Sensor_Alarm
FROM Alarm
)
, UnPivotColumns AS (
SELECT rowNumber, value = LO_Avg FROM AddRowNumbers WHERE LO_Sensor_Alarm = 0
UNION ALL SELECT rowNumber, LC_Avg FROM AddRowNumbers WHERE LC_Sensor_Alarm = 0
UNION ALL SELECT rowNumber, RO_Avg FROM AddRowNumbers WHERE RO_Sensor_Alarm = 0
UNION ALL SELECT rowNumber, RC_Avg FROM AddRowNumbers WHERE RC_Sensor_Alarm = 0
)
SELECT rowNumber.sub_id
, cds.equipment_id
, cds.read_time
, cds.LC_Avg
, cds.LC_Dev
, cds.LC_Ref_Gap
, cds.LC_Sensor_Alarm
, cds.LO_Avg
, cds.LO_Dev
, cds.LO_Ref_Gap
, cds.LO_Sensor_Alarm
, cds.RC_Avg
, cds.RC_Dev
, cds.RC_Ref_Gap
, cds.RC_Sensor_Alarm
, cds.RO_Avg
, cds.RO_Dev
, cds.RO_Ref_Gap
, cds.RO_Sensor_Alarm
, COALESCE(range1.range, range2.range) AS xweb_range
FROM AddRowNumbers rowNumber
LEFT OUTER JOIN (SELECT rowNumber, range = MAX(value) - MIN(value) FROM UnPivotColumns GROUP BY rowNumber HAVING COUNT(*) > 1) range1 ON range1.rowNumber = rowNumber.rowNumber
LEFT OUTER JOIN (SELECT rowNumber, range = AVG(value) FROM UnPivotColumns GROUP BY rowNumber HAVING COUNT(*) = 1) range2 ON range2.rowNumber = rowNumber.rowNumber
INNER JOIN dbo.some_table cds
ON rowNumber.sub_id = cds.sub_id
It's difficult to understand exactly what your query is trying to do without knowing the domain. However, it seems to me like your query is simply trying to find, for each row in dbo.some_table where sub_id is not 0, the range of the following columns in the record (or, if only one matches, that single value):
LO_AVG when LO_SENSOR_ALARM=0
LC_AVG when LC_SENSOR_ALARM=0
RO_AVG when RO_SENSOR_ALARM=0
RC_AVG when RC_SENSOR_ALARM=0
You constructed this query by assigning each row a sequential row number, unpivoting the _AVG columns along with their row number, computing the range aggregate grouped by row number, and then joining back to the original records by row number. CTEs don't materialize results (nor are they indexed, as discussed in the comments). So each reference to AddRowNumbers is expensive, because ROW_NUMBER() OVER (ORDER BY LO_Avg) is a sort.
Instead of cutting this table up just to join it back together by row number, why not do something like:
SELECT cds.sub_id
, cds.equipment_id
, cds.read_time
, cds.LC_Avg
, cds.LC_Dev
, cds.LC_Ref_Gap
, cds.LC_Sensor_Alarm
, cds.LO_Avg
, cds.LO_Dev
, cds.LO_Ref_Gap
, cds.LO_Sensor_Alarm
, cds.RC_Avg
, cds.RC_Dev
, cds.RC_Ref_Gap
, cds.RC_Sensor_Alarm
, cds.RO_Avg
, cds.RO_Dev
, cds.RO_Ref_Gap
, cds.RO_Sensor_Alarm
--if the COUNT is 0, xweb_range will be null (since MAX will be null), if it's 1, then use MAX, else use MAX - MIN (as per your example)
, (CASE WHEN stats.[Count] < 2 THEN stats.[MAX] ELSE stats.[MAX] - stats.[MIN] END) xweb_range
FROM dbo.some_table cds
--cross join on the following table derived from values in cds - it will always contain 1 record per row of cds
CROSS APPLY
(
SELECT COUNT(*), MIN(Value), MAX(Value)
FROM
(
--construct a table using the column values from cds we wish to aggregate
VALUES (LO_AVG, LO_SENSOR_ALARM),
(LC_AVG, LC_SENSOR_ALARM),
(RO_AVG, RO_SENSOR_ALARM),
(RC_AVG, RC_SENSOR_ALARM)
) x (Value, Sensor_Alarm) --give a name to the columns for _AVG and _ALARM
WHERE Sensor_Alarm = 0 --filter our constructed table where _ALARM=0
) stats([Count], [Min], [Max]) --give our derived table and its columns some names
WHERE cds.sub_id <> '0' --this is a filter carried over from the first CTE in your example
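The per-row rule that the CROSS APPLY computes can be checked outside the database with a small sketch like this one. It is not part of the query: the `xweb_range` function and the sample readings are hypothetical, and it only mirrors the CASE expression above (no usable value gives NULL, one gives that value, two or more give MAX minus MIN).

```python
from typing import Optional, Sequence, Tuple


def xweb_range(readings: Sequence[Tuple[float, int]]) -> Optional[float]:
    """Compute the range over (avg, sensor_alarm) pairs, ignoring alarmed sensors."""
    usable = [avg for avg, alarm in readings if alarm == 0]
    if not usable:
        return None                      # COUNT = 0: MAX is NULL in the SQL version
    if len(usable) == 1:
        return usable[0]                 # COUNT = 1: the single value itself
    return max(usable) - min(usable)     # COUNT >= 2: MAX - MIN


# Hypothetical row: LO, LC, RO, RC as (avg, sensor_alarm) pairs.
row = [(10.0, 0), (12.5, 0), (99.0, 1), (11.0, 0)]
print(xweb_range(row))
# 2.5
```

Comparing a handful of rows against this function is a quick way to confirm that the rewritten query returns the same xweb_range as the original.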