Say I have a table called:
TableA
The following columns exist in the table:
Column1, Column2, Column3
What I am trying to accomplish is to see how many records are not null in each column.
To do this I have the following case statement:
sum(Case when Column1 is not null then 1 else 0 end)
What I want is to run the above case statement for every column of every table in a provided list.
So for the above example the case statement will run for Column1, Column2 and Column3, as there are 3 columns in that particular table, and so on.
But I want to specify a list of tables to loop through, executing the logic above.
create procedure tab_cols (@tab nvarchar(255))
as
begin
declare @col_count nvarchar(max) = ''
,@col nvarchar(max) = ''
select @col_count += case ORDINAL_POSITION when 1 then '' else ',' end + 'count(' + QUOTENAME(COLUMN_NAME,']') + ') as ' + QUOTENAME(COLUMN_NAME,']')
,@col += case ORDINAL_POSITION when 1 then '' else ',' end + QUOTENAME(COLUMN_NAME,']')
from INFORMATION_SCHEMA.COLUMNS
where TABLE_NAME = @tab
order by ORDINAL_POSITION
declare @stmt nvarchar(max) = 'select * from (select ' + @col_count + ' from ' + @tab + ') t unpivot (val for col in (' + @col + ')) u'
exec sp_executesql @stmt
end
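The question also asks for a list of tables to loop through. A minimal sketch of that outer loop follows, assuming the tab_cols procedure above has already been created. The table variable, the cursor name and the table name TableB are hypothetical and only illustrate the idea. Note that COUNT(column) counts only non-NULL values, so it gives the same figure as the SUM(CASE ...) expression in the question.

```sql
-- Hypothetical list of tables to inspect; replace with your own names.
DECLARE @tables TABLE (TableName sysname);
INSERT INTO @tables (TableName) VALUES (N'TableA'), (N'TableB');

DECLARE @tab nvarchar(255);

-- LOCAL FAST_FORWARD: read-only, forward-only cursor scoped to this batch.
DECLARE tab_cursor CURSOR LOCAL FAST_FORWARD FOR
    SELECT TableName FROM @tables;

OPEN tab_cursor;
FETCH NEXT FROM tab_cursor INTO @tab;

WHILE @@FETCH_STATUS = 0
BEGIN
    -- Returns one result set per table: a row per column with its non-NULL count.
    EXEC tab_cols @tab;
    FETCH NEXT FROM tab_cursor INTO @tab;
END

CLOSE tab_cursor;
DEALLOCATE tab_cursor;
```

Each call returns its own result set; collecting everything into one table would need an INSERT ... EXEC into a temp table with matching columns.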
Wouldn't it be as easy as this?
SELECT AccountID
,SUM(Total) AS SumTotal
,SUM(Profit) AS SumProfit
,SUM(Loss) AS SumLoss
FROM tblAccount
GROUP BY AccountID
If I understand this correctly you want to get the sums, but not for all rows in one go but for each accountID separately. This is what GROUP BY is for...
If ever possible try to avoid loops, cursors and other procedural approaches...
UPDATE: Generic approach for different tables
With different tables you will probably need exactly the statement I show above, but you'll have to generate it dynamically and use EXEC to execute it. You can go through INFORMATION_SCHEMA.COLUMNS to get the column names...
But:
How should this script know generically, which columns should be summed up? You might head for data_type like 'decimal%' or similar...
What about the other columns and their usage in GROUP BY?
How would you want to place aliases?
How do you want to continue with a table of unknown structure?
To be honest: I think there is no truly generic, one-for-all approach for this...
Currently I have prepared a string of columns which can be added to (hence the need for the dynamic query).
I have @cols which can be printed with an output like "Color","Size","Width"
I then have a SELECT/COUNT statement which needs to look as follows...
SELECT
Product_code,
count(distinct [Color]),
count(distinct [Size]),
count(distinct [Width])
FROM.....
I need each of the columns that I have in my string of columns to be counted with distinct.
It would be even better if I could add an AS with the name of each of these in here too!
Any help is much appreciated - my SQL is OK but the dynamic bit turns me blue!
Cheers.
Convert your comma-separated list to a table first. See this
Assuming that table is named ListOfColumns:
DECLARE @Query NVARCHAR(MAX) = '';
-- Append one COUNT(DISTINCT ...) expression per listed column
SELECT @Query += ', COUNT(DISTINCT ' + QUOTENAME(c.COLUMN_NAME) + ') AS ' + QUOTENAME(c.COLUMN_NAME)
FROM INFORMATION_SCHEMA.COLUMNS c
INNER JOIN ListOfColumns d ON c.COLUMN_NAME = d.ColName
WHERE c.TABLE_NAME = 'MyTable';
-- Strip the leading ', ' (two characters) before prepending SELECT
SET @Query = 'SELECT ' + STUFF( @Query,1,2,'' ) + ' FROM MyTable';
EXEC ( @Query );
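The step of turning the comma-separated list into a table is only linked, not shown. One possible sketch, assuming SQL Server 2016 or later (compatibility level 130, where STRING_SPLIT is available) and assuming the string looks like the "Color","Size","Width" example from the question. The table and column names ListOfColumns and ColName follow the answer above; the sample value of @cols is taken from the question.

```sql
-- Sample input in the shape described in the question.
DECLARE @cols nvarchar(max) = N'"Color","Size","Width"';

-- Table name and column name as assumed by the answer above.
CREATE TABLE ListOfColumns (ColName sysname NOT NULL);

-- Split on commas, then strip the double quotes around each name.
INSERT INTO ListOfColumns (ColName)
SELECT REPLACE(value, N'"', N'')
FROM STRING_SPLIT(@cols, N',');

-- Check the result: one row per column name.
SELECT ColName FROM ListOfColumns;
```

On older versions a hand-written split function would be needed instead; this sketch also assumes that no column name itself contains a comma or a double quote.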
I'm trying to write a query in which I can select data from a series of tables. I want to be able to pull those table names FROM ANOTHER TABLE; I don't want to just write
select * from tableA union select * from tableB etc.
A further restriction that's complicating the issue is that my query MUST start with select.
I've tried to use OPENQUERY within the select statement but the server I'm trying to access is 'not configured for DATA ACCESS.'
You can do something like this:
DECLARE @SQL AS VARCHAR(MAX);
SELECT @SQL = COALESCE(@SQL + ' ', '') +
'SELECT * FROM ' + TableName +
CASE
WHEN TableName = MAX(TableName) OVER () THEN ''
ELSE ' UNION ALL '
END
FROM TableNames;
EXEC(@SQL);
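The concatenation above relies on the rows being read in name order, which SQL Server does not guarantee without an ORDER BY. As a hedged alternative, assuming SQL Server 2017 or later (where STRING_AGG exists) and the same TableNames table with a TableName column, the separator can be handled by the aggregate itself:

```sql
DECLARE @SQL nvarchar(max);

-- STRING_AGG puts ' UNION ALL ' only between items, so no special case
-- for the last table is needed. The CAST to nvarchar(max) avoids the
-- 8000-byte limit that applies to non-max input types.
SELECT @SQL = STRING_AGG(
           CAST(N'SELECT * FROM ' + QUOTENAME(TableName) AS nvarchar(max)),
           N' UNION ALL ')
FROM TableNames;

EXEC sp_executesql @SQL;
```

QUOTENAME here assumes TableName holds a bare table name without a schema prefix. Like the version above, this is still a batch that starts with DECLARE, so it does not by itself satisfy the "must start with select" restriction.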
Please help me with this. I am totally stuck. I have coder's block or something.
I have the following table
ID Name Cost Included
---- ---------- ------- ----------
1 Package1 10.00 Yes
2 Package2 20.00 No
3 Package3 20.00 Yes
I would like to crosstab this information to display it like the following example. There will be more columns in the table.
Type Package1 Package2 Package3
----- ------------ ----------- ----------
Name Package1 Package2 Package3
Cost 10.00 20.00 30.00
Included Yes No Yes
It seems to me that you are trying to build a product comparison list. If this is true, you might unpivot the table first and then join individual records together.
The 'transponded' part unpivots the columns. All columns must be of compatible types or converted to one; I chose varchar(100). transponded returns a table with three columns: ID from ProductInfo, Type as the column name, and Value as the value of the corresponding column.
The select part joins together info on as many products as demanded, by adding another left join transponded tn on t1.Type = tn.Type and tn.ID = @parametern. This part seems like a hassle, but when I tried to do it with pivot I failed to get the columns in the proper order: pivot sorted the names in Type. Pivot would also demand dynamic SQL generation. This solution is fixed, provided that you add enough joins for the maximum number of products you wish to compare at once. I believe it would not be over 5.
=1, =2 and =3 should be replaced by parameters. The query should be hosted in a stored procedure.
; with transponded as
(
select ID, Type, Value
from
(
select ID,
Name,
cast (Cost as varchar(100)) Cost,
cast (case when Included = 1 then 'Yes' else 'No' end as varchar(100)) Included
from ProductInfo
) p
unpivot (Value for Type in (Name, Cost, Included) ) a
)
select t1.Type,
t1.Value Product1,
t2.Value Product2,
t3.Value Product3
from transponded t1
left join transponded t2
on t1.Type = t2.Type
and t2.id = 2
left join transponded t3
on t1.Type = t3.Type
and t3.id = 3
where t1.id = 1
In short, transpond one record at time and join to another transponded record by Type column.
Oh, and here is a Sql Fiddle playground.
There is no easy way to do this, as the pivot will need to be aggregated by column. Given that adding columns to the input table would cause a maintenance issue where these values will not be presented to the output until the code is changed wherever it is used, I'd say you're probably best doing it once with a stored procedure, which will dynamically generate the output you're looking for based on the schema of the input table.
I have demonstrated how this can be done, using the data you have supplied. This data is stored in a temp table (not #temp, because the stored proc won't work with temporary tables), populated thus:
CREATE TABLE temp (
_key int,
package_name varchar(50),
cost float,
included bit
)
INSERT INTO temp VALUES(1,'Package1', 10.00, 1)
INSERT INTO temp VALUES(2,'Package2', 20.00, 0)
INSERT INTO temp VALUES(3,'Package3', 20.00, 1)
The stored procedure retrieves a list of values based on the @pivot_field parameter, and uses these values as a column list to be inserted after the "Type" field. It then unions the pivot field and all other fields together to generate the rows, pivoting one column at a time. The procedure is as follows:
CREATE PROCEDURE usp_get_pivot (@table_name nvarchar(255), @pivot_field nvarchar(255)) AS
BEGIN
CREATE TABLE #temp (val nvarchar(max))
DECLARE @sql NVARCHAR(MAX), @cols NVARCHAR(MAX), @col NVARCHAR(255)
SET @sql = 'SELECT DISTINCT ' + @pivot_field + ' FROM ' + @table_name
INSERT INTO #temp EXEC sp_executesql @sql;
SET @cols = (SELECT '[' + val + '],' FROM #temp FOR XML PATH(''))
SET @cols = SUBSTRING(@cols, 1, LEN(@cols)-1)
SET @SQL = N'SELECT ''' + @pivot_field + ''' as [type], *
FROM (SELECT ' + @pivot_field + ', ' + @pivot_field + ' as ' + @pivot_field + '1 FROM ' + @table_name + ') AS source_table
PIVOT (max(' + @pivot_field + '1) FOR ' + @pivot_field + ' IN (' + @cols + ')) AS pivot_table'
DECLARE csr CURSOR FOR
SELECT c.name FROM sys.columns c, sys.objects o
WHERE c.object_id = o.object_id AND o.name = @table_name
AND c.name <> @pivot_field
ORDER BY column_id
OPEN csr
FETCH NEXT FROM csr INTO @col
WHILE @@FETCH_STATUS = 0
BEGIN
SET @sql = @sql + ' UNION ALL
SELECT ''' + @col + ''' as [type], *
FROM (SELECT ' + @pivot_field + ', CAST(' + @col + ' AS VARCHAR) AS ' + @col + ' FROM ' + @table_name + ') AS source_table
PIVOT (max(' + @col + ') FOR ' + @pivot_field + ' IN (' + @cols + ')) AS pivot_table'
FETCH NEXT FROM csr INTO @col
END
CLOSE csr
DEALLOCATE csr
DROP TABLE #temp
EXEC sp_executesql @sql
END
You should be able to simply copy and paste the procedure into Management Studio, create the data as shown above and execute the procedure with:
EXEC usp_get_pivot 'temp', 'package_name'
If the number of packages is not static, I think there is no option for you. The PIVOT clause can produce only a static, predefined number of columns.
You may do some table-to-table rewriting using multiple statements, but you still have to deal with a static number of columns.
But you may set it to, for example, 10 and then display up to 10 packages, with NULLs in the remaining columns if there are fewer packages.
You may also use dynamic SQL to get a dynamic number of columns, but it will be a headache.
If you're going to export this data to Excel, do not pivot it in SQL; do the transposition in Excel (it's under "Paste Special").
Basically, what I have at this stage is the following.
SELECT [Type],
MAX(Beginner) AS [Beginner],
MAX(Intermediate) AS [Intermediate],
MAX(Advanced) AS [Advanced]
FROM
(
SELECT
'Name' AS TYPE,
CASE WHEN Name='Beginner' THEN Name END AS [Beginner],
CASE WHEN Name='Intermediate' THEN Name END AS [Intermediate],
CASE WHEN Name='Advanced' THEN Name END AS [Advanced]
FROM Administration.Package
UNION ALL
SELECT
'Price' AS TYPE,
CASE WHEN Name='Beginner' THEN CAST(Price AS VARCHAR) END AS [Beginner],
CASE WHEN Name='Intermediate' THEN CAST(Price AS VARCHAR) END AS [Intermediate],
CASE WHEN Name='Advanced' THEN CAST(Price AS VARCHAR) END AS [Advanced]
FROM Administration.Package
)A
GROUP BY [Type]
But it does not feel right to have the union for each and every column.
I'm looking for a schema-independent query. That is, if I have a users table or a purchases table, the query should be equally capable of catching duplicate rows in either table without any modification (other than the from clause, of course).
I'm using T-SQL, but I'm guessing there should be a general solution.
I believe that this should work for you. Keep in mind that CHECKSUM() isn't 100% perfect - it's theoretically possible to get a false positive here (I think), but otherwise you can just change the table name and this should work:
;WITH cte AS (
SELECT
*,
CHECKSUM(*) AS chksum,
ROW_NUMBER() OVER(ORDER BY GETDATE()) AS row_num
FROM
My_Table
)
SELECT
*
FROM
CTE T1
INNER JOIN CTE T2 ON
T2.chksum = T1.chksum AND
T2.row_num <> T1.row_num
The ROW_NUMBER() is needed so that you have some way of distinguishing rows. It requires an ORDER BY and that can't be a constant, so GETDATE() was my workaround for that.
Simply change the table name in the CTE and it should work without spelling out the columns.
I'm still confused about what "detecting them might be" but I'll give it a shot.
Excluding them is easy
e.g.
SELECT DISTINCT * FROM USERS
However, if you want to include only the duplicates, and a duplicate means that all the fields match, then you have to do
SELECT
[Each and every field]
FROM
USERS
GROUP BY
[Each and every field]
HAVING COUNT(*) > 1
You can't get away with just using (*) because you can't GROUP BY *
so this requirement from your comments is difficult
a schema-independent means I don't want to specify all of the columns
in the query
Unless that is you want to use dynamic SQL and read the columns from sys.columns or information_schema.columns
For example
DECLARE @columns nvarchar(max)
SET @columns = ''
SELECT @columns = @columns + '[' + COLUMN_NAME +'], '
FROM INFORMATION_SCHEMA.columns
WHERE table_name = 'USERS'
-- LEN ignores the trailing space, so this removes the final comma
SET @columns = left(@columns,len(@columns ) - 1)
DECLARE @SQL nvarchar(max)
-- Note the spaces around FROM and GROUP BY; without them the statement does not parse
SET @SQL = 'SELECT ' + @columns
+ ' FROM USERS GROUP BY '
+ @columns
+ ' Having Count(*) > 1'
exec sp_executesql @SQL
Please note you should read "The Curse and Blessings of Dynamic SQL" if you haven't already.
I have done this using CTEs in SQL Server.
Here is a sample on how to delete dupes but you should be able to adapt it easily to find dupes:
WITH CTE (COl1, Col2, DuplicateCount)
AS
(
SELECT COl1,Col2,
ROW_NUMBER() OVER(PARTITION BY COl1,Col2 ORDER BY Col1) AS DuplicateCount
FROM DuplicateRcordTable
)
DELETE
FROM CTE
WHERE DuplicateCount > 1
GO
Here is a link to an article where I got the SQL:
http://blog.sqlauthority.com/2009/06/23/sql-server-2005-2008-delete-duplicate-rows/
I recently was looking into the same issue and noticed this question.
I managed to solve it using a stored procedure with some dynamic SQL. This way you only need to specify the table name. And it will get all the other relevant data from sys tables.
/*
This SP returns all duplicate rows (1 line for each duplicate) for any given table.
to use the SP:
exec [database].[dbo].[sp_duplicates]
@table = '[database].[schema].[table]'
*/
create proc dbo.sp_duplicates @table nvarchar(50) as
declare @query nvarchar(max)
declare @groupby nvarchar(max)
set @groupby = stuff((select ',' + [name]
FROM sys.columns
WHERE object_id = OBJECT_ID(@table)
FOR xml path('')), 1, 1, '')
set @query = 'select *, count(*)
from '+@table+'
group by '+@groupby+'
having count(*) > 1'
exec (@query)
I've picked up some SQL similar to the following:
IF EXISTS(SELECT name FROM tempdb..sysobjects WHERE name Like N'#tmp%'
and id=object_id('tempdb..#tmp'))
DROP TABLE #tmp
select * into #tmp from permTable
I need to add more data to #tmp before continuing processing:
insert into #tmp
select * from permTable2
But this gives errors because SQL has assumed sizes and types for the #tmp columns (e.g. if permTable has a column full of ints but permTable2 has a column with the same name but with a NULL in one record, you get "Cannot insert the value NULL into column 'IsPremium', table 'tempdb.dbo.#tmp'").
How do I get #tmp to have the types I want? Is this really bad practise?
Have you considered creating a table variable instead? You can declare the columns like this:
declare @sometable table(
SomeField [nvarchar](15),
SomeOtherField [decimal](15,2));
This is why select into is a poor idea for your problem. Create the table structure specifically with a create table command and then write two insert statements.
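A minimal sketch of that suggestion, assuming both permanent tables share an Id column and the IsPremium column from the error message. The column list and types here are hypothetical (only IsPremium appears in the question) and must be adapted to the real tables.

```sql
-- Drop any leftover copy from a previous run in this session.
IF OBJECT_ID('tempdb..#tmp') IS NOT NULL
    DROP TABLE #tmp;

-- Declare types and nullability explicitly instead of letting
-- SELECT ... INTO infer them from the first source table.
CREATE TABLE #tmp (
    Id        int NOT NULL,  -- hypothetical key column
    IsPremium bit NULL       -- nullable, so rows from permTable2 fit
);

INSERT INTO #tmp (Id, IsPremium)
SELECT Id, IsPremium FROM permTable;

INSERT INTO #tmp (Id, IsPremium)
SELECT Id, IsPremium FROM permTable2;
```

Naming the columns in both the INSERT and the SELECT also guards against the two source tables listing their columns in a different order.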
It isn't possible.
If you need to generate the table definition typelist now,
create a view from the select statement, and read the columns and their definition from information_schema... (this work of art won't consider decimal and/or datetime2)
Note: this will give you the lowest possible field-length for varchar/varbinary columns you currently selected. You need to adjust them manually...
SELECT
','
+ COLUMN_NAME
+ ' '
+ DATA_TYPE
+ ' '
+ ISNULL
(
'('
+
CASE
WHEN CHARACTER_MAXIMUM_LENGTH = -1
THEN 'MAX'
ELSE CAST(CHARACTER_MAXIMUM_LENGTH AS varchar(36))
END
+ ')'
, ''
)
+ ' '
+ CASE WHEN IS_NULLABLE = 'NO' THEN 'NOT NULL' ELSE '' END
FROM information_schema.columns
WHERE table_name = '________theF'
ORDER BY ORDINAL_POSITION
And the field-list for the insert-statement:
SELECT
',' + COLUMN_NAME
FROM information_schema.columns
WHERE table_name = '________theF'
ORDER BY ORDINAL_POSITION