not just another 'column invalid in select list' error - sql

I say this because I tried all the usual solutions and they just aren't working. Here's what I have:
Table 1
CREATE TABLE dbo.Temp
(
PrintData nvarchar(250) NOT NULL,
Acronym nvarchar(3) NOT NULL,
Total int not null
)
this is successfully populated using 3 SELECT's with a Group By unioned together
Table 2
CREATE TABLE dbo.Result
(
PrintData nvarchar(250) NOT NULL,
Acronym nvarchar(3) NOT NULL,
Total int not null,
[Percent] decimal(7,5) not null
)
All I want to do is populate this table from Table 1 while adding the Percent column, which I calculate using the following statement:
INSERT INTO dbo.Result
(PrintData, Acronym, Total, [Percent])
select *, ((t.Total / SUM(t.Total)) * 100)
from Temp t
group by PrintData, Acronym, Total
But the Percent column comes out as 0.00000 on every row.
I thought it might have something to do with the GROUP BY, but if I remove it, I get that stupid error I quoted.
Some sample data from Table 1:
OSHIKANGO OSH 1
WINDHOEK 1 WHA 18
WINDHOEK 2 WHB 8
WINDHOEK 3 WHC 2
WINDHOEK 4 WHD 4
With this sample data, SUM(Total) is 33. What I want in Table 2 is this:
OSHIKANGO OSH 1 3.03030
WINDHOEK 1 WHA 18 54.5454
WINDHOEK 2 WHB 8 24.2424
WINDHOEK 3 WHC 2 etc
WINDHOEK 4 WHD 4
It seems this should be simpler, and I hope I don't have to go as far as using a transaction/cursor loop.

Try restructuring your query a bit, as below: compute the grand total separately in a derived table, join it to every row, and calculate the percentage from it. Casting Total to a decimal first avoids integer division.
INSERT INTO dbo.Result (PrintData, Acronym, Total, [Percent])
select t1.PrintData,
t1.Acronym,
t1.Total,
cast(t1.Total as decimal(12,5)) / tab.GrandTotal * 100
from Temp t1
cross join
(
-- a single row holding the total over the whole table
select SUM(Total) as GrandTotal
from Temp
) tab;

There is a casting problem: Total is an int, so Total / SUM(Total) is integer division and truncates to 0. Try this query:
INSERT INTO dbo.Result
SELECT PrintData,
Acronym,
Sum(Total) [total],
Round(Sum(Total) / Cast((SELECT Sum(Total)
FROM temp) AS DECIMAL(10, 4)) * 100, 4) [Percent]
FROM temp
GROUP BY PrintData,Acronym
Also, I see you are grouping by Total too. In that case you can use this:
INSERT INTO dbo.Result
SELECT *,Round((Sum(Total)OVER(partition BY PrintData, Acronym)) / Cast(Sum(Total) OVER() AS DECIMAL(10, 4)) * 100, 4) AS [percent]
FROM temp

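Convert both operands to decimal before dividing. Note that decimal(7,5) only holds values up to 99.99999, so use a wider type for the intermediate values. With the SUM in a subquery, the GROUP BY is no longer needed:
INSERT INTO dbo.Result
(PrintData, Acronym, Total, [Percent])
select PrintData, Acronym, Total,
convert(decimal(12,5), Total) * 100 /
(select SUM(convert(decimal(12,5), Total)) from Temp)
from Temp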
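For reference, here is a minimal sketch of why the original statement produced 0.00000 and how a decimal operand fixes it. It assumes SQL Server 2008 or later, and @Temp is a hypothetical table variable standing in for dbo.Temp, loaded with the sample rows from the question.

```sql
-- @Temp is a hypothetical stand-in for dbo.Temp
DECLARE @Temp TABLE
(
    PrintData nvarchar(250) NOT NULL,
    Acronym   nvarchar(3)   NOT NULL,
    Total     int           NOT NULL
);

INSERT INTO @Temp (PrintData, Acronym, Total)
VALUES (N'OSHIKANGO',  N'OSH', 1),
       (N'WINDHOEK 1', N'WHA', 18),
       (N'WINDHOEK 2', N'WHB', 8),
       (N'WINDHOEK 3', N'WHC', 2),
       (N'WINDHOEK 4', N'WHD', 4);

SELECT PrintData,
       Acronym,
       Total,
       -- int / int is integer division: 0 whenever Total < SUM(Total)
       Total / SUM(Total) OVER ()                               AS IntPercent,
       -- multiplying by 100.0 first makes the whole expression decimal
       CAST(100.0 * Total / SUM(Total) OVER () AS decimal(7,5)) AS [Percent]
FROM @Temp;
```

Against the sample data, IntPercent is 0 on every row, while [Percent] comes out at roughly 3.03, 54.55, 24.24, 6.06 and 12.12.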

Related

Count length of consecutive duplicate values for each id

I have a table as shown in the screenshot (first two columns) and I need to create a column like the last one. I'm trying to calculate the length of each sequence of consecutive values for each id.
For this, the last column is required. I played around with
row_number() over (partition by id, value)
but did not have much success, since the circled number was (quite predictably) computed as 2 instead of 1.
Please help!
First of all, we need a way to define how the rows are ordered. For example, in your sample data there is no way to be sure that the 'first' row (1, 1) will always be returned before the 'second' row (1, 0).
That's why I have added an identity column to my sample data. In your real case, the rows can be ordered by a row ID, a date column or something else, but you need to ensure they can be sorted by a unique criterion.
So, the task is pretty simple:
calculate the switch trigger - 1 whenever the value changes
calculate the groups
calculate the row numbers within each group
That's it. I have used common table expressions and kept all the columns so that the logic is easy to follow. You are free to break this into separate statements and remove some of the columns.
DECLARE @DataSource TABLE
(
[RowID] INT IDENTITY(1, 1)
,[ID]INT
,[value] INT
);
INSERT INTO @DataSource ([ID], [value])
VALUES (1, 1)
,(1, 0)
,(1, 0)
,(1, 1)
,(1, 1)
,(1, 1)
--
,(2, 0)
,(2, 1)
,(2, 0)
,(2, 0);
WITH DataSourceWithSwitch AS
(
SELECT *
,IIF(LAG([value]) OVER (PARTITION BY [ID] ORDER BY [RowID]) = [value], 0, 1) AS [Switch]
FROM @DataSource
), DataSourceWithGroup AS
(
SELECT *
,SUM([Switch]) OVER (PARTITION BY [ID] ORDER BY [RowID]) AS [Group]
FROM DataSourceWithSwitch
)
SELECT *
,ROW_NUMBER() OVER (PARTITION BY [ID], [Group] ORDER BY [RowID]) AS [GroupRowID]
FROM DataSourceWithGroup
ORDER BY [RowID];
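As a side note, the same numbering can be sketched without LAG, which may help on versions older than SQL Server 2012. This is the common "difference of two row numbers" trick rather than the method above: the difference stays constant inside a run of equal values, so it can serve as the group key that the asker's row_number() over (partition by id, value) was missing. The table and column names repeat the sample above; the [Island] column name is made up for illustration.

```sql
DECLARE @DataSource TABLE
(
    [RowID] INT IDENTITY(1, 1),
    [ID]    INT,
    [value] INT
);

INSERT INTO @DataSource ([ID], [value])
VALUES (1, 1), (1, 0), (1, 0), (1, 1), (1, 1), (1, 1),
       (2, 0), (2, 1), (2, 0), (2, 0);

WITH Islands AS
(
    SELECT *
          -- constant within a run of equal values, different for the next run
          ,ROW_NUMBER() OVER (PARTITION BY [ID] ORDER BY [RowID])
         - ROW_NUMBER() OVER (PARTITION BY [ID], [value] ORDER BY [RowID]) AS [Island]
    FROM @DataSource
)
SELECT [RowID], [ID], [value]
      ,ROW_NUMBER() OVER (PARTITION BY [ID], [value], [Island] ORDER BY [RowID]) AS [GroupRowID]
FROM Islands
ORDER BY [RowID];
```

For the ten sample rows, GroupRowID should come out as 1, 1, 2, 1, 2, 3 for ID 1 and 1, 1, 1, 2 for ID 2.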
You want results that depend on the actual ordering of the data in the data source. In SQL you operate on relations, and only sometimes on ordered sets of their rows. Your desired end result is not well-defined in terms of SQL unless you introduce an additional column in your source table over which the data is ordered (e.g. an auto-increment or a timestamp column).
Note: this answers the original question and doesn't take into account additional timestamp column mentioned in the comment. I'm not updating my answer since there is already an accepted answer.
One way to solve it could be through a recursive CTE:
create table #tmp (i int identity,id int, value int, rn int);
insert into #tmp (id,value) VALUES
(1,1),(1,0),(1,0),(1,1),(1,1),(1,1),
(2,0),(2,1),(2,0),(2,0);
WITH numbered AS (
SELECT i,id,value, 1 seq FROM #tmp WHERE i=1 UNION ALL
SELECT a.i,a.id,a.value, CASE WHEN a.id=b.id AND a.value=b.value THEN b.seq+1 ELSE 1 END
FROM #tmp a INNER JOIN numbered b ON a.i=b.i+1
)
SELECT * FROM numbered -- OPTION (MAXRECURSION 1000)
This will return the following:
i id value seq
1 1 1 1
2 1 0 1
3 1 0 2
4 1 1 1
5 1 1 2
6 1 1 3
7 2 0 1
8 2 1 1
9 2 0 1
10 2 0 2
See my little demo here: https://rextester.com/ZZEIU93657
A prerequisite for the CTE to work is a sequenced source table (e.g. a table with an identity column in it). In my example I introduced the column i for this. As a starting point I need to find the first entry of the source table; in my case this is the entry with i=1.
For a longer source table you might run into a recursion-limit error as the default for MAXRECURSION is 100. In this case you should uncomment the OPTION setting behind my SELECT clause above. You can either set it to a higher value (like shown) or switch it off completely by setting it to 0.
IMHO, this is easier to do with a cursor and a loop.
Maybe there is also a way to do the job with a self-join:
declare @t table (id int, val int)
insert into @t (id, val)
select 1 as id, 1 as val
union all select 1, 0
union all select 1, 0
union all select 1, 1
union all select 1, 1
union all select 1, 1
;with cte1 (id , val , num ) as
(
select id, val, row_number() over (ORDER BY (SELECT 1)) as num from @t
)
, cte2 (id, val, num, N) as
(
select id, val, num, 1 from cte1 where num = 1
union all
select t1.id, t1.val, t1.num,
case when t1.id=t2.id and t1.val=t2.val then t2.N + 1 else 1 end
from cte1 t1 inner join cte2 t2 on t1.num = t2.num + 1 where t1.num > 1
)
select * from cte2

Group rows by a certain identifier and update a group id column to track which group they belong

There is a table where the group id column needs to be populated with a generated incremental number like the following example:
name batch
a 1
b 1
c 1
d 2
e 2
f 2
g 3
h 3
i 3
j 4
k 4
Order does not matter as long as the groups have the same number of elements.
I'm looking for some ideas on how this can be achieved.
What I was thinking of is building a stored procedure that iterates through the result set.
I also have this "pseudo code" that I'm working with, but it obviously has issues, and it does not do the update part; it just selects. I was also thinking of creating a temp table, which seems like overkill.
SELECT
name,
batch = (ROW_NUMBER() OVER(ORDER BY name)) % ((Select COUNT(*) from [abc].[dbo].[cdg]) / 30))
FROM
[abc].[dbo].[cdg] x
Try this integer division method:
Declare @param int = 3
SELECT name,
( ( Row_number()OVER(ORDER BY name) - 1 ) / @param ) + 1 as Batch
FROM tablename
If you want to derive @param from the table's row count, use Count(*) Over():
SELECT name,
( ( Row_number()OVER(ORDER BY name) - 1 ) / (Count(*) Over()/30) ) + 1 as Batch
FROM tablename
To update the batch column
;with update_cte as
(
SELECT name,
Batch,
( ( Row_number()OVER(ORDER BY name) - 1 ) / @param ) + 1 as gen_seq
FROM tablename
)
Update update_cte set Batch = gen_seq
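To make the update concrete, here is a small self-contained sketch of the integer division approach run against the eleven sample names from the question. The table variable @cdg is a hypothetical stand-in for the real table, and a batch size of 3 is assumed.

```sql
-- @cdg is a hypothetical stand-in for [abc].[dbo].[cdg]
DECLARE @cdg TABLE (name char(1) NOT NULL, batch int NULL);

INSERT INTO @cdg (name)
VALUES ('a'), ('b'), ('c'), ('d'), ('e'), ('f'),
       ('g'), ('h'), ('i'), ('j'), ('k');

DECLARE @param int = 3;  -- assumed batch size

WITH update_cte AS
(
    SELECT batch,
           -- rows 1-3 give 1, rows 4-6 give 2, and so on
           ((ROW_NUMBER() OVER (ORDER BY name) - 1) / @param) + 1 AS gen_seq
    FROM @cdg
)
UPDATE update_cte SET batch = gen_seq;

SELECT name, batch FROM @cdg ORDER BY name;
```

With this data, a to c land in batch 1, d to f in batch 2, g to i in batch 3, and j and k in batch 4, which matches the example in the question.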
Are you looking for something like this? You can use the NTILE function:
drop table if exists dbo.Tile;
create table dbo.Tile (
Chr char(1)
);
insert into dbo.Tile (Chr)
values ('a'), ('b'), ('c'), ('d'), ('e')
, ('f'), ('g'), ('h'), ('i'), ('j')
, ('k');
declare @Tile int
set @Tile = ceiling((select count(t.Chr) from dbo.Tile t) / 3.)
select
*
, ntile(@Tile) over (order by t.Chr) as batch
from dbo.Tile t

how to get some records in first row in sql

I have a query like this:
Select WarehouseCode from [tbl_VW_Epicor_Warehse]
my output looks like this
WarehouseCode
Main
Mfg
SP
W01
W02
W03
W04
W05
But sometimes I want to get W04 as the first record, and sometimes W01.
How can I write a query that puts a chosen record in the first row?
Any help is appreciated.
You could select the row with the code you want to appear first using a WHERE condition that matches that row alone, then UNION ALL a second SELECT that returns all the other rows,
as follows:
SELECT WarehouseCode FROM Table WHERE WarehouseCode ='W04'
UNION ALL
SELECT WarehouseCode FROM Table WHERE WarehouseCode <>'W04'
Use a parameter to choose the top row, which can be passed to your query as required, and sort by a column calculated on whether the value matches the parameter; something like the ORDER BY clause in the following:
DECLARE @Warehouses TABLE (Id INT NOT NULL, Label VARCHAR(3))
INSERT @Warehouses VALUES
(1,'W01')
,(2,'W02')
,(3,'W03')
DECLARE @TopRow VARCHAR(3) = 'W02'
SELECT *
FROM @Warehouses
ORDER BY CASE Label WHEN @TopRow THEN 1 ELSE 0 END DESC
Maybe you don't need to store this list in a table, and want something like this?
SELECT * FROM (VALUES ('WarehouseCode'),
('Main'),
('Mfg'),
('SP'),
('W01'),
('W02'),
('W03'),
('W04'),
('W05')) as v(s)
Here you can change the order manually as you like.
As commented by @ankit bajpai:
You are looking for custom sorting, which is achieved with a CASE expression in the ORDER BY clause.
Whenever you want W04 on top you can use
ORDER BY Case When col = 'W04' THEN 1 ELSE 2 END
Example below:
Select col from
(
select 'Main' col union ALL
select 'Mfg' union ALL
select 'SP' union ALL
select 'W01' union ALL
select 'W02' union ALL
select 'W03' union ALL
select 'W04' union ALL
select 'W05'
) randomtable
ORDER BY Case When col = 'W04' THEN 1 ELSE 2 END
EDIT (after an answer was accepted)
In support of @Maha Khairy's answer, which is the accepted one and the only one that takes a different approach;
all the others push the OP towards ORDER BY with a CASE expression.
Let's try the UNION ALL approach:
create table #testtable (somedata varchar(10))
insert into #testtable
Select col from
(
select 'W05' col union ALL
select 'Main' union ALL
select 'Mfg' union ALL
select 'SP' union ALL
select 'W01' union ALL
select 'W02' union ALL
select 'W03' union ALL
select 'W04'
) randomtable
Select * From #testtable where somedata = 'W04'
Union ALL
Select * From #testtable where somedata <> 'W04'
With this data the result set comes back in the order the OP requested.
The idea is to first get all rows equal to 'W04', then all rows
not equal to 'W04', and concatenate the results, so that the 'W04' rows
come out on top because that branch appears first in the query. Fair enough.
But ordering is not the only reason to write it this way; there is another,
major one: PERFORMANCE.
Yes:
an ORDER BY on a CASE expression can never take advantage of an index, but the branches of the UNION ALL can. To explore this further, build the test table
and check the difference:
CREATE TABLE #Orders
(
OrderID integer NOT NULL IDENTITY(1,1),
CustID integer NOT NULL,
StoreID integer NOT NULL,
Amount float NOT NULL,
makesrowfat nchar(4000)
);
GO
-- Sample data
WITH
Cte0 AS (SELECT 1 AS C UNION ALL SELECT 1), --2 rows
Cte1 AS (SELECT 1 AS C FROM Cte0 AS A, Cte0 AS B),--4 rows
Cte2 AS (SELECT 1 AS C FROM Cte1 AS A ,Cte1 AS B),--16 rows
Cte3 AS (SELECT 1 AS C FROM Cte2 AS A ,Cte2 AS B),--256 rows
Cte4 AS (SELECT 1 AS C FROM Cte3 AS A ,Cte3 AS B),--65536 rows
Cte5 AS (SELECT 1 AS C FROM Cte4 AS A ,Cte2 AS B),--1048576 rows
FinalCte AS (SELECT ROW_NUMBER() OVER (ORDER BY C) AS Number FROM Cte5)
INSERT #Orders
(CustID, StoreID, Amount)
SELECT
CustID = Number / 10,
StoreID = Number % 4,
Amount = 1000 * RAND(Number)
FROM FinalCte
WHERE
Number <= 1000000;
GO
Let's now do the same for CustID 93190:
Create NONclustered Index IX_CustID_Orders ON #Orders (CustID)
INCLUDE (OrderID ,StoreID, Amount ,makesrowfat )
Warm cache results:
SET STATISTICS TIME ON
DECLARE @OrderID integer
DECLARE @CustID integer
DECLARE @StoreID integer
DECLARE @Amount float
DECLARE @makesrowfat nchar(4000)
Select @OrderID =OrderID ,
@CustID =CustID ,
@StoreID =StoreID ,
@Amount =Amount ,
@makesrowfat=makesrowfat
FROM
(
Select * From #Orders where custid =93190
Union ALL
Select * From #Orders where custid <>93190
)TLB
-- elapsed time = 2571 ms
GO
-- second test, run as a separate batch because the variables are declared again
DECLARE @OrderID integer
DECLARE @CustID integer
DECLARE @StoreID integer
DECLARE @Amount float
DECLARE @makesrowfat nchar(4000)
Select @OrderID =OrderID ,
@CustID =CustID ,
@StoreID =StoreID ,
@Amount =Amount ,
@makesrowfat=makesrowfat
From #Orders
ORDER BY Case When custid = 93190 THEN 1 ELSE 2 END
-- elapsed time = 70616 ms
UNION ALL performance: 2571 ms. ORDER BY CASE performance: 70616 ms.
UNION ALL is the clear winner here; ORDER BY CASE is nowhere near it in performance.
BUT we forgot that SQL is a declarative language:
we have no direct control over how the data is fetched. There is a piece of software (one that changes between releases)
sitting between
the user and the SQL Server database engine: the SQL Server query optimizer, whose job is to return the data set
the user specified while using the least amount of resources. So there is no guarantee
that you will always get the rows in this order unless you specify ORDER BY.
Some other references:
@Damien_The_Unbeliever:
Why would anyone offer an ordering guarantee except when an ORDER BY clause is included?
There's an obvious opportunity for parallelism (if sufficient resources are available) to compute each result set in parallel and serve each result row
(from the parallel queries) to the client in whatever order each individual result row becomes available.
Conor Cunningham:
If you need order in your query results, put in an ORDER BY. It's that simple. Anything else is like riding in a car without a seatbelt.
All comments and edits are welcome.
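One way to reconcile the two points above is to keep the UNION ALL shape but make the order explicit with a constant sort key per branch, so the result no longer depends on how the optimizer happens to concatenate the branches. This is only a sketch: @Warehouses and @TopRow are hypothetical names, and I have not measured whether it keeps the performance advantage reported above.

```sql
-- @Warehouses is a hypothetical stand-in for tbl_VW_Epicor_Warehse
DECLARE @Warehouses TABLE (WarehouseCode varchar(10) NOT NULL);

INSERT INTO @Warehouses (WarehouseCode)
VALUES ('Main'), ('Mfg'), ('SP'), ('W01'), ('W02'), ('W03'), ('W04'), ('W05');

DECLARE @TopRow varchar(10) = 'W04';  -- the code that should come first

SELECT WarehouseCode
FROM
(
    -- branch 0: the chosen row
    SELECT WarehouseCode, 0 AS SortKey
    FROM @Warehouses
    WHERE WarehouseCode = @TopRow
    UNION ALL
    -- branch 1: everything else
    SELECT WarehouseCode, 1
    FROM @Warehouses
    WHERE WarehouseCode <> @TopRow
) AS w
ORDER BY SortKey, WarehouseCode;
```

This returns W04 first, followed by the remaining codes in WarehouseCode order, and the ordering is guaranteed by the ORDER BY rather than by the plan.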

How to select info from row above?

I want to add a column to my table that is like the following:
This is just an example of how the table is structured, the real table is more than 10.000 rows.
No_   Name         Account_Type   Subgroup   (New_Column)
100   Sales        3
200   Underwear    0              250        *100
300   Bikes        0              250        *100
400   Profit       3
500   Cash         0              450        *400
So every time there is a value in 'Subgroup', I want (New_Column) to get the value of [No_] from the row above.
No_   Name         Account_Type   Subgroup   (New_Column)
100   Sales        3
150   TotalSales   3
200   Underwear    0              250        *150
300   Bikes        0              250        *150
400   Profit       3
500   Cash         0              450        *400
There are cases where the table looks like the one above, with two "headers" in a row. In that case I also want the nearest row above (150 in this case).
Is this a case for a cursor or what do you recommend?
The data is ordered by No_
--EDIT--
Starting from the first line and then running through the whole table:
Is there a way I can store the value of [No_] where [Subgroup] is ''?
And then insert this [No_] value into (New_Column) in each row below that has a value in the [Subgroup] column.
When [Subgroup] is empty again, the process keeps going with the next [No_] value, inserting it into (New_Column) if the next line has a value in [Subgroup].
Here is a better image of what I'm trying to do:
SQL Server 2012 introduced window offset functions.
In this case: LAG.
Something like this:
SELECT [No_]
,[Name]
,[Account_Type]
,[Subgroup]
,LAG([No_]) OVER(PARTITION BY [Subgroup]
ORDER BY [No_]) as [PrevValue]
FROM table
Here is an example from MS:
http://technet.microsoft.com/en-us/library/hh231256.aspx
The ROW_NUMBER function will allow you to find out what number the row is, but because it is a windowed function, you will have to use a common table expression (CTE) to join the table with itself.
WITH cte AS
(
SELECT [No_], Name, Account_Type, Subgroup, [Row] = ROW_NUMBER() OVER (ORDER BY [No_])
FROM table
)
SELECT t1.*, t2.[No_] AS PrevNo
FROM cte t1
-- t2 is the row directly above t1
LEFT JOIN cte t2 ON t2.[Row] = t1.[Row] - 1
Hope this helps.
The next query returns the Name of the parent row instead of the row's own name, i.e. Sales for Sales, Underwear and Bikes; and Profit for Profit and Cash:
select ISNULL(t2.Name, t1.Name)
from table t1
left join table t2 on t1.NewColumn = t2.No
So in SQL Server 2008 I created a test table with 3 values in it:
create table #ttable
(
id int primary key identity,
number int,
number_prev int
)
Go
Insert Into #ttable (number)
Output inserted.id
Values (10), (20), (30);
An insert into the table that does what you need (at least if I understood correctly) looks like this:
declare @new_value int;
set @new_value = 13; -- NEW value
Insert Into #ttable (number, number_prev)
Values (@new_value,
(Select Max(number) From #ttable t Where t.number < @new_value))
[This part added] And to work with the subgroup, just modify the inner select to filter on it:
Select Max(number) From #ttable t
Where t.number < @new_value And Subgroup != @Subgroup
SELECT
No_
, Name
, Account_Type
, Subgroup
, ( SELECT MAX(above.No_)
FROM TableX AS above
WHERE above.No_ < a.No_
AND above.Account_Type = 3
AND a.Account_Type <> 3
) AS NewColumn
FROM
TableX AS a
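To see what the correlated subquery above does, here is a self-contained sketch run against the second sample table from the question. It assumes, as that query does, that header rows are the ones with Account_Type = 3; @Accounts is a hypothetical table variable standing in for TableX.

```sql
-- @Accounts is a hypothetical stand-in for TableX
DECLARE @Accounts TABLE
(
    No_          int         NOT NULL PRIMARY KEY,
    Name         varchar(20) NOT NULL,
    Account_Type int         NOT NULL,
    Subgroup     varchar(10) NOT NULL
);

INSERT INTO @Accounts (No_, Name, Account_Type, Subgroup)
VALUES (100, 'Sales',      3, ''),
       (150, 'TotalSales', 3, ''),
       (200, 'Underwear',  0, '250'),
       (300, 'Bikes',      0, '250'),
       (400, 'Profit',     3, ''),
       (500, 'Cash',       0, '450');

SELECT a.No_, a.Name, a.Account_Type, a.Subgroup,
       -- nearest header row above; NULL when the row is itself a header
       ( SELECT MAX(above.No_)
         FROM @Accounts AS above
         WHERE above.No_ < a.No_
           AND above.Account_Type = 3
           AND a.Account_Type <> 3
       ) AS NewColumn
FROM @Accounts AS a
ORDER BY a.No_;
```

NewColumn comes out as 150 for Underwear and Bikes, 400 for Cash, and NULL for the three header rows, so no cursor is needed.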

SQL Query .. a little help with AVG and MEDIAN using DISTINCT and SUM

I have a query to get the total duration of phone usage for various users.
But I need to be able to work out distinct averages for their usage. The problem is that certain users share phones and I can only grab phone info, so the call duration is repeated, which would skew the data.
So I need an average over distinct phones (on the pin.Number field). It would also be useful to get a median, if that is possible.
This is the current query:
SELECT TOP 40 SUM(Duration) AS TotalDuration, c.Caller, oin.Name, oin.Email, pin.Number, oin.PRN
FROM Calls as c
INNER JOIN Phones as pin On c.caller = pin.id
INNER JOIN officers as oin On pin.id = oin.fk_phones
WHERE Duration <> 0 AND Placed BETWEEN '01/07/2011 00:00:00' AND '20/08/2011 23:59:59'
GROUP BY c.Caller, oin.Name, pin.Number, oin.Email, oin.PRN
ORDER BY TotalDuration DESC
Many thanks for any pointers
Here's an example of the data I currently get (with the averages I am after added below). As you can see, some users share the same phone, but the number of seconds is shared between them, so I don't want that to influence the average (I don't want 11113 seconds counted twice); there needs to be a distinct on each phone number.
Here's a solution that implements the following idea:
Get totals per phone (SUM(Duration)).
Rank the resulting set by the total duration values (ROW_NUMBER() OVER (ORDER BY SUM(Duration))).
Include one more column for the total number of rows (COUNT(*) OVER ()).
From the resulting set, get the average (AVG(TotalDuration)).
Get the median as the average between two values whose rankings are
1) N div 2 + 1,
2) N div 2 + N mod 2,
where N is the number of items, div is the integer division operator, and mod is the modulo operator.
My testing table:
DECLARE @Calls TABLE (Caller int, Duration int);
INSERT INTO @Calls (Caller, Duration)
SELECT 3, 123 UNION ALL
SELECT 1, 23 UNION ALL
SELECT 2, 15 UNION ALL
SELECT 1, 943 UNION ALL
SELECT 3, 326 UNION ALL
SELECT 3, 74 UNION ALL
SELECT 9, 49 UNION ALL
SELECT 5, 66 UNION ALL
SELECT 4, 56 UNION ALL
SELECT 4, 208 UNION ALL
SELECT 4, 112 UNION ALL
SELECT 5, 521 UNION ALL
SELECT 6, 197 UNION ALL
SELECT 8, 23 UNION ALL
SELECT 7, 22 UNION ALL
SELECT 1, 24 UNION ALL
SELECT 0, 45;
The query:
WITH totals AS (
SELECT
Caller,
TotalDuration = SUM(Duration),
rn = ROW_NUMBER() OVER (ORDER BY SUM(Duration)),
N = COUNT(*) OVER ()
FROM @Calls
GROUP BY Caller
)
SELECT
Average = AVG(TotalDuration),
Median = AVG(CASE WHEN rn IN (N / 2 + 1, N / 2 + N % 2) THEN TotalDuration END)
FROM totals
The output:
Average Median
----------- -----------
282 123
Note: In Transact-SQL, / stands for integer division if both operands are integer. The modulo operator in T-SQL is %.
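On SQL Server 2012 or later, the median can also be sketched with the built-in PERCENTILE_CONT analytic function instead of the ROW_NUMBER arithmetic. This is an alternative technique, not the method above; it uses the same sample rows, and the 1.0 multiplier is my addition to get a non-truncated average.

```sql
DECLARE @Calls TABLE (Caller int, Duration int);

INSERT INTO @Calls (Caller, Duration)
VALUES (3, 123), (1, 23), (2, 15), (1, 943), (3, 326), (3, 74),
       (9, 49), (5, 66), (4, 56), (4, 208), (4, 112), (5, 521),
       (6, 197), (8, 23), (7, 22), (1, 24), (0, 45);

WITH totals AS
(
    -- one row per phone, so shared phones are counted once
    SELECT Caller, TotalDuration = SUM(Duration)
    FROM @Calls
    GROUP BY Caller
)
SELECT DISTINCT
    -- window aggregates repeat on every row; DISTINCT collapses them to one
    Average = AVG(1.0 * TotalDuration) OVER (),
    Median  = PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY TotalDuration) OVER ()
FROM totals;
```

For the ten per-caller totals this gives a median of 123, the same as the query above, and an average of 282.7 rather than the truncated 282.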
I hope you can use this. I did it with table variables:
declare @calls table (number char(4), duration int)
declare @officers table(number char(4), name varchar(10))
insert @calls values (3321,1)
insert @calls values (3321,1)
insert @calls values (3321,1)
insert @calls values (3321,42309)
insert @calls values (1235,34555)
insert @calls values (2979,31133)
insert @calls values (2324,24442)
insert @calls values (2345,11113)
insert @calls values (3422,9922)
insert @calls values (3214,8333)
insert @officers values(3321, 'Peter')
insert @officers values(1235, 'Stewie')
insert @officers values(2979, 'Lois')
insert @officers values(2324, 'Brian')
insert @officers values(2345, 'Chris')
insert @officers values(2345, 'Peter')
insert @officers values(3422, 'Frank')
insert @officers values(3214, 'John')
insert @officers values(3214, 'Mark')
SQL to get the median and average:
;with a as
(
select sum(duration) total_duration, number from @calls group by number
)
select avg(a.total_duration) avg_duration, c.total_duration median_duration from a
cross join (
select top 1 total_duration from (
select top 50 percent total_duration from a order by total_duration desc) b order by
total_duration) c
group by c.total_duration
Try here: https://data.stackexchange.com/stackoverflow/q/108612/
SQL to get the total durations:
select o.name, c.total_duration, c.number from @officers o join
(select sum(duration) total_duration, number from @calls group by number) c
on o.number = c.number
order by total_duration desc
Try here: https://data.stackexchange.com/stackoverflow/q/108611/