Merge rows with same username in SSIS - sql

I have data originating from Active Directory in a flat file that I need to load into SQL Server using SSIS. My challenge is that I want to do all the operations in SSIS, so that the data written to the database is already the final output. The flat file has several rows with the same username; these need to be combined into one row per user, with the values of one column concatenated, as in the illustration below:
   Username Office LocationID Dept
   -------- ------ ---------- -----
1. btan     HQ     01         Acct
2. cvill    South  04         HR
3. cvill    North  02         HR
4. btan     East   03         Acct
5. cvill    West   05         HR
6. lkays    HQ     01         Legal
My output should be as follows and it should all be done using SSIS:
   Username LocationID Dept
   -------- ---------- -----
1. btan     01, 03     Acct
2. cvill    04, 02, 05 HR
6. lkays    01         Legal
Any help will be very much appreciated.

I support the prior suggestions that this is a bad data model, and I also support the SQL (non-SSIS) solution. However, if you must follow this path despite our warnings, take a look at the SSIS Pivot transformation. You'll need to concatenate the resulting columns into one column.

Something like this will get you a comma-delimited list of the IDs, one row per username:
SELECT DISTINCT T.Username, STUFF(T2.IDList, 1, 2, '') AS LocationID, T.Dept
FROM TableName T
OUTER APPLY
(
    -- builds ', 01, 03' for the current user; STUFF above strips the leading ', '
    SELECT ', ' + LocationID AS [text()]
    FROM TableName
    WHERE Username = T.Username
    FOR XML PATH('')
) T2(IDList)
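To make the intended result concrete, here is a minimal sketch of the same merge in Python rather than SSIS or T-SQL. It only illustrates the grouping logic (one output row per username, LocationIDs joined in input order); the function name `merge_by_username` and the tuple layout are assumptions made for this example, and it takes the Dept from the first row seen for each user.

```python
# Sample rows from the question: (Username, Office, LocationID, Dept)
rows = [
    ("btan", "HQ", "01", "Acct"),
    ("cvill", "South", "04", "HR"),
    ("cvill", "North", "02", "HR"),
    ("btan", "East", "03", "Acct"),
    ("cvill", "West", "05", "HR"),
    ("lkays", "HQ", "01", "Legal"),
]


def merge_by_username(rows):
    """Collapse rows to one per username, concatenating LocationID values."""
    merged = {}  # dicts keep insertion order, so users appear in first-seen order
    for username, _office, location_id, dept in rows:
        entry = merged.setdefault(username, {"ids": [], "dept": dept})
        entry["ids"].append(location_id)
    return [(user, ", ".join(e["ids"]), e["dept"]) for user, e in merged.items()]


result = merge_by_username(rows)
for row in result:
    print(row)
```

The Office column is dropped on purpose, since it does not appear in the desired output.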

Related

Case Statement with 2 columns

If the doctor's name is SoinSo, I want the "Clinic Number Column" to show "SO" instead of "06".
This only needs to be a SELECT statement, not an actual change to the database.
I'm not sure how to code this in SQL to get this specific output.
This is the current output:
Dr Name Column | Clinic Number Column
---------------+---------------------
Doe            | 06
SoinSo         | 06
James          | 06
This is the desired Output:
Dr Name Column | Clinic Number Column
---------------+---------------------
Doe            | 06
SoinSo         | SO
James          | 06
I've tried this, but couldn't find any documentation online about doing a CASE statement for 2 columns:
When stf.DrName='SoInSo' then pc.ClinicNumberColumn='SO'
You can get the desired output with the following (not tested):
SELECT
stf.DrName
,CASE WHEN stf.DrName='SoInSo' THEN 'SO' ELSE pc.ClinicNumberColumn END AS "CLINIC"
FROM
TABLE
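The key point is that CASE is an expression that returns a value for one output column; it does not assign to a column. A minimal way to try this out, assuming a single hypothetical table `visits` in an in-memory SQLite database (the real tables behind the `stf` and `pc` aliases are not shown in the question):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in table; column names follow the question.
conn.execute("CREATE TABLE visits (DrName TEXT, ClinicNumberColumn TEXT)")
conn.executemany(
    "INSERT INTO visits VALUES (?, ?)",
    [("Doe", "06"), ("SoinSo", "06"), ("James", "06")],
)

# CASE yields the replacement value for matching rows and the stored
# value for everything else; the table itself is not modified.
rows = conn.execute(
    """
    SELECT DrName,
           CASE WHEN DrName = 'SoinSo' THEN 'SO'
                ELSE ClinicNumberColumn
           END AS Clinic
    FROM visits
    ORDER BY rowid
    """
).fetchall()

for row in rows:
    print(row)
```

Note that SQLite compares strings case-sensitively by default, so the literal here matches the spelling in the sample data.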

How can I add data from table to other in SQL?

I have 2 departments and have created a new one. I want to update an employee's department, but only if that employee's ID is one of the IDs listed in another table.
Employees
empid empname deptname
------------------------
01 john dept1
02 bill dept2
03 alex dept1
.
.
.
80 tomas dept1
New_depts_employees_id
empid
-----
02
05
45
18
20
34
78
80
55
32
If an employee's ID is in the second table, his deptname should become 'dept3'.
How can I write this in SQL? (I am using MS Access.)
Do you want SQL? You can use UPDATE with EXISTS as follows:
UPDATE Employees
SET deptname = 'dept3'
WHERE EXISTS (SELECT 1 FROM New_depts_employees_id n WHERE n.empid = Employees.empid)
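As a quick check of how UPDATE with EXISTS behaves, here is a small sketch run against an in-memory SQLite database instead of MS Access. The table and column names follow the question; the handful of rows is a shortened sample chosen for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Employees (empid TEXT PRIMARY KEY, empname TEXT, deptname TEXT);
    CREATE TABLE New_depts_employees_id (empid TEXT);
    INSERT INTO Employees VALUES
        ('01', 'john', 'dept1'),
        ('02', 'bill', 'dept2'),
        ('03', 'alex', 'dept1'),
        ('80', 'tomas', 'dept1');
    INSERT INTO New_depts_employees_id VALUES ('02'), ('05'), ('80');
    """
)

# Only employees whose empid appears in the second table are updated.
conn.execute(
    """
    UPDATE Employees
    SET deptname = 'dept3'
    WHERE EXISTS (SELECT 1
                  FROM New_depts_employees_id n
                  WHERE n.empid = Employees.empid)
    """
)

depts = dict(conn.execute("SELECT empid, deptname FROM Employees"))
print(depts)
```

IDs in the second table that have no matching employee (such as '05' here) are simply ignored.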
Open new query constructor.
Add both tables to it.
Drag Employees.empid and drop it onto New_depts_employees_id.empid; a link appears. This is not needed if the link was created automatically.
Change query type to UPDATE.
Set "Column to update" to Employees.deptname.
Set "Value to set" to 'dept3'.
Click "Execute".
You can save this query and convert the static 'dept3' value to a query parameter for future use from an external application. Alternatively, open the query constructor in SQL mode and copy the query text from it for external use.

How to load grouped data with SSIS

I have a tricky flat file data source. The data is grouped, like this:
Country City
U.S.    New York
        Washington
        Baltimore
Canada  Toronto
        Vancouver
But I want it to be this format when it's loaded in to the database:
Country City
U.S.    New York
U.S.    Washington
U.S.    Baltimore
Canada  Toronto
Canada  Vancouver
Has anyone met such a problem before? Any ideas on how to deal with it?
The only idea I have so far is to use a cursor, but that is just too slow.
Thank you!
The answer by cha will work, but here is another in case you need to do it in SSIS without temporary/staging tables:
You can run your dataflow through a Script Transformation that uses a DataFlow-level variable. As each row comes in, the script checks the value of the Country column.
If it has a non-blank value, populate the variable with that value and pass the row along in the dataflow.
If Country has a blank value, overwrite it with the value of the variable, which will be the last non-blank Country value you got.
EDIT: I looked up your error message and learned something new about Script Components (the Data Flow tool, as opposed to Script Tasks, the Control Flow tool):
The collection of ReadWriteVariables is only available in the
PostExecute method to maximize performance and minimize the risk of
locking conflicts. Therefore you cannot directly increment the value
of a package variable as you process each row of data. Increment the
value of a local variable instead, and set the value of the package
variable to the value of the local variable in the PostExecute method
after all data has been processed. You can also use the
VariableDispenser property to work around this limitation, as
described later in this topic. However, writing directly to a package
variable as each row is processed will negatively impact performance
and increase the risk of locking conflicts.
That comes from this MSDN article, which also has more information about the VariableDispenser workaround if you want to go that route. Apparently I misled you above when I said you can set the value of the package variable in the script. You have to use a variable that is local to the script and then write it to the package variable in the PostExecute method. I can't tell from the article whether that means you will also be unable to read the variable in the script; if so, the VariableDispenser would be the only option. Or I suppose you could create another variable that the script has read-only access to, and set its value to an expression so that it always has the value of the read-write variable. That might work.
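The per-row logic with a script-local variable can be sketched as follows. This is Python, not the C# or VB.NET an SSIS Script Component would actually use, and `fill_down_country` is a name invented for this example; it only shows the "remember the last non-blank Country" idea.

```python
def fill_down_country(rows):
    """Yield (country, city) pairs with blank countries filled from above."""
    last_country = None  # local state, playing the role of the script-local variable
    for country, city in rows:
        if country and country.strip():
            # Non-blank: remember it for the rows that follow.
            last_country = country.strip()
        # Blank or not, emit the most recent non-blank country.
        yield (last_country, city)


source = [
    ("U.S.", "New York"),
    ("", "Washington"),
    ("", "Baltimore"),
    ("Canada", "Toronto"),
    ("", "Vancouver"),
]

filled = list(fill_down_country(source))
for row in filled:
    print(row)
```

This relies on the rows arriving in file order, which is also what the Script Transformation approach assumes.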
Yes, it is possible. First you need to load the data to a table with an IDENTITY column:
-- drop table #t
CREATE TABLE #t (id INTEGER IDENTITY PRIMARY KEY,
Country VARCHAR(20),
City VARCHAR(20))
INSERT INTO #t(Country, City)
SELECT a.Country, a.City
FROM OPENROWSET( BULK 'c:\import.txt',
FORMATFILE = 'c:\format.fmt',
FIRSTROW = 2) AS a;
select * from #t
The result will be:
id          Country              City
----------- -------------------- --------------------
1           U.S.                 New York
2                                Washington
3                                Baltimore
4           Canada               Toronto
5                                Vancouver
And now with a bit of recursive CTE magic you can populate the missing details:
;WITH a as(
SELECT Country
,City
,ID
FROM #t WHERE ID = 1
UNION ALL
SELECT COALESCE(NULLIF(LTrim(#t.Country), ''),a.Country)
,#t.City
,#t.ID
FROM a INNER JOIN #t ON a.ID+1 = #t.ID
)
SELECT * FROM a
OPTION (MAXRECURSION 0)
Result:
Country City ID
-------------------- -------------------- -----------
U.S. New York 1
U.S. Washington 2
U.S. Baltimore 3
Canada Toronto 4
Canada Vancouver 5
Update:
As Tab Alleman suggested below, the same result can be achieved without the recursive query:
SELECT ID
, COALESCE(NULLIF(LTrim(a.Country), ''), (SELECT TOP 1 Country FROM #t t WHERE t.ID < a.ID AND LTrim(t.Country) <> '' ORDER BY t.ID DESC)) AS Country
, City
FROM #t a
BTW, the format file for your input data is this (if you want to try the scripts save the input data as c:\import.txt and the format file below as c:\format.fmt):
9.0
2
1 SQLCHAR 0 11 "" 1 Country SQL_Latin1_General_CP1_CI_AS
2 SQLCHAR 0 100 "\r\n" 2 City SQL_Latin1_General_CP1_CI_AS

SQL Server : set all column aliases in a dynamic query

It's a bit of a long and convoluted story why I need to do this, but I will be getting a query string which I will then be executing with this code
EXECUTE sp_ExecuteSQL
I need to set the aliases of all the columns to "value". There could be a variable number of columns in the queries that are being passed in, and they could be all sorts of data types, for example
SELECT
Company, AddressNo, Address1, Town, County, Postcode
FROM Customers
SELECT
OrderNo, OrderType, CustomerNo, DeliveryNo, OrderDate
FROM Orders
Is this possible and relatively simple to do, or will I need to get the aliases included in the SQL queries? (It would be easier not to do this, if it can be avoided and done when we process the query.)
---Edit---
As an example, the output from the first query would be
Company   AddressNo Address1     Town   County   Postcode
--------- --------- ------------ ------ -------- --------
Dave Inc  12345     1 Main Road  Harlow Essex    HA1 1AA
AA Tyres  12234     5 Main Road  Epping Essex    EP1 1PP
I want it to be
value     value     value        value  value    value
--------- --------- ------------ ------ -------- --------
Dave Inc  12345     1 Main Road  Harlow Essex    HA1 1AA
AA Tyres  12234     5 Main Road  Epping Essex    EP1 1PP
So each of the columns has an alias of "value".
I could do this with
SELECT
Company AS 'value', AddressNo AS 'value', Address1 AS 'value', Town AS 'value', County AS 'value', Postcode AS 'value'
FROM Customers
but it would be better (it would save additional complexity in other steps in the process chain) if we didn't have to manually alias each column in the SQL we're feeding in to this section of the process.
Regarding the XY problem, this is a tiny section in a very large process chain, it would take pages to explain the whole process in detail - in essence, we're taking code out of our database triggers and putting it into a dynamic procedure; then we will have frontends that users will access to "edit" the SQL statements that are called by the triggers and these will then dynamically feed the results out into other systems. It works if we manually alias the SQL going in, but it would be neater if there was a way we could feed clean SQL into the process and then apply the aliases when the SQL is processed - it would keep us DRY, to start with.
I do not understand at all what you are trying to accomplish, but I believe the answer is no: there is no built-in way to globally predefine or override column aliases for ad hoc queries. You will need to code it yourself.
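One possible shape for "coding it yourself" is to read the column names of the incoming query from its result-set metadata and wrap the query in an outer SELECT that aliases every column. The sketch below uses Python with SQLite purely as a stand-in; `alias_all_columns` and `quote_ident` are hypothetical helpers, and on SQL Server the column names would have to come from that engine's own metadata facilities instead. It assumes the incoming query has no trailing semicolon and that its column names are unique.

```python
import sqlite3


def quote_ident(name):
    """Quote an identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def alias_all_columns(conn, query, alias="value"):
    """Return a query that selects every column of `query` under one alias."""
    # LIMIT 0 fetches no rows but still exposes the column metadata.
    probe = conn.execute(f"SELECT * FROM ({query}) AS q LIMIT 0")
    names = [col[0] for col in probe.description]
    select_list = ", ".join(
        f"{quote_ident(n)} AS {quote_ident(alias)}" for n in names
    )
    return f"SELECT {select_list} FROM ({query}) AS q"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (Company TEXT, Town TEXT, County TEXT)")
conn.executemany(
    "INSERT INTO Customers VALUES (?, ?, ?)",
    [("Dave Inc", "Harlow", "Essex"), ("AA Tyres", "Epping", "Essex")],
)

wrapped = alias_all_columns(conn, "SELECT Company, Town, County FROM Customers")
cur = conn.execute(wrapped)
headers = [col[0] for col in cur.description]
data = sorted(cur.fetchall())
print(headers)
print(data)
```

The original query text stays untouched, which keeps the "clean SQL in, aliases applied at processing time" split the question asks for.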

oracle - sql query select max from each base

I'm trying to write a query that finds the top balance at each base. Balance is in one table and bases are in another table.
This is the existing query I have. It returns all the results, but I need a way to limit it to the single top result per baseID.
SELECT o.names.name, t.accounts.bidd.baseID, MAX(t.accounts.balance)
FROM order o, table(c.accounts) t
WHERE t.accounts.acctype = 'verified'
GROUP BY o.names.name, t.accounts.bidd.baseID;
accounts is a nested table.
This is the output:
Name            accounts.BIDD.baseID      MAX(T.accounts.BALANCE)
--------------- ------------------------- ---------------------------
Jerard          010                       1251.21
john            012                       3122.2
susan           012                       3022.2
fin             012                       3022.2
dan             010                       1751.21
What I want is to calculate the highest balance for each baseID and display only one record for that baseID.
So the output would display only john for baseID 012, because he has the highest balance.
Any pointers in the right direction would be fantastic.
I think the problem is caused by the "Name" column. Since you have three names mapped to one base ID (012), the query treats the three records as separate groups instead of grouping them together.
Try leaving the "Name" column out of both the SELECT list and the GROUP BY clause:
SELECT t.accounts.bidd.baseID, MAX(t.accounts.balance)
FROM order o, table(c.accounts) t
WHERE t.accounts.acctype = 'verified'
GROUP BY t.accounts.bidd.baseID;
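The query above returns the highest balance per baseID but no longer shows whose balance it is, whereas the question also wants the name displayed. As a rough illustration of that "top row per group" logic, here is a sketch in Python over the sample output from the question; `top_balance_per_base` is a name made up for this example, and ties are resolved in favour of the first row seen.

```python
# Sample rows from the question: (name, baseID, balance)
rows = [
    ("Jerard", "010", 1251.21),
    ("john", "012", 3122.2),
    ("susan", "012", 3022.2),
    ("fin", "012", 3022.2),
    ("dan", "010", 1751.21),
]


def top_balance_per_base(rows):
    """Map each baseID to the (name, balance) pair with the highest balance."""
    best = {}
    for name, base_id, balance in rows:
        # Keep the row only if it beats the current best for this base.
        if base_id not in best or balance > best[base_id][1]:
            best[base_id] = (name, balance)
    return best


top = top_balance_per_base(rows)
for base_id, (name, balance) in sorted(top.items()):
    print(base_id, name, balance)
```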