Datatable Compare Rows - vb.net

I have a datatable object, which is populated from a webservice.
Apparently, the web service just throws everything (data) back to me. The data which gets in my datatable looks like this:
Dept Code Value
Science ABC 5
Science ABC 6
Science DEF 7
Math ABC 3
Math DEF 9
English ABC 2
English DEF 3
English DEF 4
English DEF 5
Now, I want to create a datatable that sums the values of rows sharing the same Dept and Code and eliminates the duplicate rows, so that the new datatable would contain:
Dept Code Value
Science ABC 11
Science DEF 7
Math ABC 3
Math DEF 9
English ABC 2
English DEF 12
Please take note that I could only modify the datatable.
Can anyone help me? VB.Net please. Thanks.

A simple summary query will give you what you want:
SELECT Dept, Code, SUM(Value) sum_value FROM datatable GROUP BY Dept, Code
You could also create a view with that SQL definition, so you could
just query the view as you would a table. If you start to get so much
data that the query is slow, you'll want to store the results in a
permanent table - but for moderate amounts of data this should work fine.
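To see what the summary query returns for the sample rows in the question, here is a minimal sketch that runs it against an in-memory SQLite database from Python. The use of Python and SQLite is an assumption made only for illustration (the question asks for VB.NET and an in-memory DataTable); the table name datatable follows the query above.

```python
import sqlite3

# Sample rows copied from the question.
rows = [
    ("Science", "ABC", 5), ("Science", "ABC", 6), ("Science", "DEF", 7),
    ("Math", "ABC", 3), ("Math", "DEF", 9),
    ("English", "ABC", 2), ("English", "DEF", 3),
    ("English", "DEF", 4), ("English", "DEF", 5),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE datatable (Dept TEXT, Code TEXT, Value INTEGER)")
conn.executemany("INSERT INTO datatable VALUES (?, ?, ?)", rows)

# One output row per (Dept, Code) pair, with the values summed.
summed = conn.execute(
    "SELECT Dept, Code, SUM(Value) AS sum_value "
    "FROM datatable GROUP BY Dept, Code ORDER BY Dept, Code"
).fetchall()

for row in summed:
    print(row)
```

The same grouping can be done without any database by accumulating the values in a dictionary keyed on (Dept, Code).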

SQL Server 2008. Take info from two tables and concatenate row values. [duplicate]

This question already has answers here:
Simulating group_concat MySQL function in Microsoft SQL Server 2005?
(12 answers)
Closed 4 years ago.
I have looked at what was marked as the duplicate of this and it is not. I'm pulling from two tables, not one.
First, allow me to say I had nothing to do with the design of this database.
I have two tables that must be joined, and then an unknown number of rows whose data must be concatenated into one giant string. They are joined by the Record ID.
Item table:
Item RecordID
---------------------
Car A 123
Car B 456
Car C 789
Yes, the words literally cut off in the middle. There should be nothing added between the values, and I also need to keep the commas and other special characters.
Details table:
RecordID Details
--------------------------------
123 black pain
123 t, radials
123 , green le
123 ather, spo
123 rt steerin
123 g wheel, b
123 uilt-in GP
123 S
456 standard
789 black leat
789 her, teles
789 coping ste
789 ering whee
789 l, seven c
789 up holders
789 , heavy du
789 ty mudflap
789 s
What I want to end up with is this:
ItemID RecordID Details
----------------------------------------------------------------------------
Car A 123 black paint, radials, green leather, sport steering wheel, built-in GPS
Car B 456 standard
Car C 789 black leather, telescoping steering wheel, seven cup holders, heavy duty mudflaps
I've looked at all the XML ones and can't figure out how to do this.
Thanks in advance.
There is no guarantee that your STUFF/ FOR XML PATH will produce the results you ask for unless you have an IDENTITY field in your Details table or some other value that you can sort by that will force the order of the Details text.
Usually you could use the STUFF command with an ORDER BY statement:
SELECT
    Item.Item AS ItemID,
    Item.RecordID,
    STUFF( (
        SELECT
            '' + Details
        FROM
            Details
        WHERE
            Details.RecordID = Item.RecordID
        -- ORDER BY SomeLineIndicator
        FOR XML PATH ('')
    ), 1, 0, '' ) AS Details
FROM
    Item
I tried this on my box without the ORDER BY and just so happened to get the result you're asking for, but you really can't rely on these results without a field you can use to force the order.
Please read this post and the linked articles for more information about why you'd need a field for this and why you can't depend on an undetermined internal "index" to take care of it for you: Default row order in SELECT query - SQL Server 2008 vs SQL 2012
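To illustrate why an explicit ordering column matters, here is a small Python sketch. It assumes a hypothetical line_no column that records the position of each fragment; the original Details table has no such column, which is exactly the problem described above. The rows are deliberately stored out of order.

```python
from collections import defaultdict

# Hypothetical detail rows: (record_id, line_no, fragment).
# line_no is an assumed ordering column, not part of the original table.
details = [
    (456, 1, "standard"),
    (123, 2, "t, radials"),
    (123, 1, "black pain"),
    (123, 3, ", green le"),
    (123, 4, "ather"),
]

def concat_details(rows):
    """Concatenate fragments per record, ordered explicitly by line_no."""
    grouped = defaultdict(list)
    for record_id, _line_no, fragment in sorted(rows, key=lambda r: (r[0], r[1])):
        grouped[record_id].append(fragment)
    # Nothing is inserted between fragments, so commas and spaces are kept as stored.
    return {record_id: "".join(parts) for record_id, parts in grouped.items()}

combined = concat_details(details)
print(combined[123])
```

Without the sort on line_no, the fragments for record 123 would be joined in storage order and the text would come out scrambled.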

remove hard coded values from view

I have a table, tblSwaps. It looks a bit like below.
swapdate region bb_ticker swap_units
2017-01-01 EU ABC 10
2017-01-01 US ABC 40
2017-01-01 EU DEF 13
2017-01-01 US DEF 12
2017-02-20 EU ABC 8
2017-02-20 US ABC 40
2017-02-20 EU DEF 13
2017-02-20 US DEF 12
I also have another table, tblCodes
code
ABC
DEF
I then have a view, vw_SwapTotal, query below. This basically sums the number of swap units for each bb_ticker for a certain date.
SELECT swapdate, bb_ticker, SUM(swap_units) AS total_swap_units
FROM tblSwaps
GROUP BY swapdate, bb_ticker
I have created another view (this is where I have a question), shown below, which makes use of the view above (not sure if this is the best idea or not). The problem with the query below is that I have hard coded the bb_tickers (ABC, DEF), which is not ideal as there will be new bb_tickers in the future.
WITH swp AS (
SELECT *
FROM vw_SwapTotal
WHERE bb_ticker IN ('ABC', 'DEF'))
SELECT swapdate, isnull(ABC, 0) ABC, isnull(DEF, 0) DEF
FROM swp AS source PIVOT (max(total_swap_units) FOR bb_ticker IN ([ABC], [DEF])) AS pvt
What is the best way to get rid of the hard coded bb_tickers in this view?
Could this be what you want?
WHERE bb_ticker IN (select distinct bb_ticker from tblSwaps))
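Note that the subquery only removes the hard coded list from the WHERE clause; in T-SQL the PIVOT ... IN list still has to name its columns literally, so a fully dynamic set of columns is usually built outside the view. As a rough sketch of that idea, the pivot can be done in application code, driven by the rows of tblCodes. The Python below is an assumption for illustration only: the codes list stands in for the result of SELECT code FROM tblCodes, and pivot_totals is a hypothetical helper name.

```python
from collections import defaultdict

# Stands in for "SELECT code FROM tblCodes"; a new ticker only needs a new row there.
codes = ["ABC", "DEF"]

# Sample rows from tblSwaps: (swapdate, region, bb_ticker, swap_units).
swaps = [
    ("2017-01-01", "EU", "ABC", 10), ("2017-01-01", "US", "ABC", 40),
    ("2017-01-01", "EU", "DEF", 13), ("2017-01-01", "US", "DEF", 12),
    ("2017-02-20", "EU", "ABC", 8), ("2017-02-20", "US", "ABC", 40),
    ("2017-02-20", "EU", "DEF", 13), ("2017-02-20", "US", "DEF", 12),
]

def pivot_totals(swap_rows, tickers):
    """Return {swapdate: {ticker: total_swap_units}} with 0 for missing tickers."""
    totals = defaultdict(lambda: dict.fromkeys(tickers, 0))
    for swapdate, _region, ticker, units in swap_rows:
        if ticker in tickers:  # ignore tickers that are not listed in tblCodes
            totals[swapdate][ticker] += units
    return dict(totals)

pivoted = pivot_totals(swaps, codes)
for swapdate, columns in pivoted.items():
    print(swapdate, columns)
```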

When using vb.net I need to join 2 datatables to just 1 gridview

I have been searching through a lot of different forums, but haven't found the help I am looking for, so here we go.
First of all, I should say that I am well aware that one solution could be to do a SQL join in my SQL statement; however, this is not so easy as I am using 2 different tables from 2 different databases. So I am interested in hearing another solution.
As it is now, I have made 2 queries, 2 datatables, and 2 gridviews.
I have succeeded in binding the data to the 2 gridviews, and now I want to "merge" them based on 1 column that they share.
I don't know how to do this in VB.net.
Basically I have 1 table in the database dbo_db_Test_Palle on the server OKPalle. From this I take the following columns and put into my datatable dt and bind to gridview 1.
| Name | ID | Organisation |
In another datatable (dt2) from dbo_db_Test_Palle2 from the server NOTokPalle I take the following columns and put into dt2, and bind to Gridview 2.
| Nickname | City | Hobby | ID
What I would like to show in just one GridView is:
| Name | ID | Organisation | Nickname | City | Hobby |
So basically I wish to add the Nickname, City and Hobby columns from dt2 to dt for the rows where the ID matches (for the others I just leave them blank).
I really hope someone out here can help me.
Something like this, but it's been a while since I've done this, you might need to hack around with it a little.
Dim ds As New DataSet
ds.Tables.Add(dt)
ds.Tables.Add(dt2)
ds.Relations.Add("rel", dt2.Columns("ID"), dt.Columns("ID"), False)
dt.Columns.Add("Nickname", GetType(System.String), "Parent.Nickname")
dt.Columns.Add("City", GetType(System.String), "Parent.City")
dt.Columns.Add("Hobby", GetType(System.String), "Parent.Hobby")
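The effect of the relation plus the "Parent." expression columns is a left join on ID: every row of dt is kept, and the extra columns stay blank where dt2 has no matching ID. Here is a minimal Python sketch of that behaviour, for illustration only; the sample rows and the helper name left_join_on_id are made up.

```python
# Hypothetical sample rows standing in for the two DataTables.
dt = [
    {"Name": "Anna", "ID": 1, "Organisation": "Org A"},
    {"Name": "Bo", "ID": 2, "Organisation": "Org B"},
]
dt2 = [
    {"Nickname": "Ann", "City": "Oslo", "Hobby": "Chess", "ID": 1},
]

def left_join_on_id(left_rows, right_rows, extra_columns):
    """Keep every left row; copy extra_columns from the right row with the same ID."""
    lookup = {row["ID"]: row for row in right_rows}
    merged = []
    for row in left_rows:
        match = lookup.get(row["ID"], {})
        extras = {col: match.get(col, "") for col in extra_columns}  # blank if no match
        merged.append({**row, **extras})
    return merged

merged = left_join_on_id(dt, dt2, ["Nickname", "City", "Hobby"])
for row in merged:
    print(row)
```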

Use Access SQL to do a grouped ranking

How do I rank salespeople by # customers grouped by department (with ties included)?
For example, given this table, I want to create the Rank column on the right. How should I do this in Access?
SalesPerson Dept #Customers Rank
Bill DeptA 20 1
Ted DeptA 30 2
Jane DeptA 40 3
Bill DeptB 50 1
Mary DeptB 60 2
I already know how to do a simple ranking with this SQL code. But I don't know how to rework this to accept grouping.
(Select Count(*) from [Tbl] Where [#Customers] < [Tblx]![#Customers]) + 1
Also, there's plenty of answers for this using SQL Server's Rank() function, but I need to do this in Access. Suggestions, please?
SELECT *, (select count(*) from tbl as tbl2 where
tbl.customers > tbl2.customers and tbl.dept = tbl2.dept) + 1 as rank from tbl
Just add the dept field to the subquery...
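As a quick check of the correlated subquery, the sketch below runs it against the sample data from the question. It uses an in-memory SQLite database from Python purely as an assumption for testing the logic (the question is about Access), and the alias rnk is used instead of rank to avoid any clash with a function name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (salesperson TEXT, dept TEXT, customers INTEGER)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?)",
    [
        ("Bill", "DeptA", 20), ("Ted", "DeptA", 30), ("Jane", "DeptA", 40),
        ("Bill", "DeptB", 50), ("Mary", "DeptB", 60),
    ],
)

# Rank = 1 + number of rows in the same dept with fewer customers.
ranked = conn.execute(
    "SELECT salesperson, dept, customers, "
    "(SELECT COUNT(*) FROM tbl AS tbl2 "
    " WHERE tbl.customers > tbl2.customers AND tbl.dept = tbl2.dept) + 1 AS rnk "
    "FROM tbl ORDER BY dept, rnk"
).fetchall()

for row in ranked:
    print(row)
```

Rows that tie on customers within a department get the same rank with this form, because neither counts the other as strictly smaller.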
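Great solution with the subquery! For huge recordsets, though, the subquery solution gets very slow. It's better (quicker) to use a self join. Note the >= in the join condition: each row then matches itself, so the lowest row in each department gets rank 1 instead of being dropped by the inner join. The grouping includes Dept because the same salesperson can appear in more than one department. Tied rows all receive the highest rank number of their tie.
SELECT tbl1.SalesPerson, tbl1.Dept, Count(*) AS Rank
FROM tbl AS tbl1 INNER JOIN tbl AS tbl2 ON (tbl1.Dept = tbl2.Dept)
AND (tbl1.[#Customers] >= tbl2.[#Customers])
GROUP BY tbl1.SalesPerson, tbl1.Dept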
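To see how a self join behaves with ties, here is a sketch of a variant that joins on >= and groups by both salesperson and department. It is run against in-memory SQLite from Python as an assumption for checking the logic, not as Access syntax; the sample data is made up and includes a tie in DeptA.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (salesperson TEXT, dept TEXT, customers INTEGER)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?)",
    [
        ("Bill", "DeptA", 20), ("Ted", "DeptA", 20), ("Jane", "DeptA", 30),
        ("Mary", "DeptB", 50),
    ],
)

# Each row is paired with every row in its dept that has the same or fewer
# customers (including itself); the number of pairs is the rank.
self_join_ranked = conn.execute(
    "SELECT t1.salesperson, t1.dept, COUNT(*) AS rnk "
    "FROM tbl AS t1 INNER JOIN tbl AS t2 "
    "  ON t1.dept = t2.dept AND t1.customers >= t2.customers "
    "GROUP BY t1.salesperson, t1.dept "
    "ORDER BY t1.dept, rnk, t1.salesperson"
).fetchall()

for row in self_join_ranked:
    print(row)
```

Bill and Ted tie on 20 customers and both come out with rank 2, whereas the subquery form with a strict comparison would give both rank 1. Which tie rule is wanted depends on the report.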
I know this is an old thread. But since I spent a great deal of time on a very similar problem and was greatly helped by the former answers given here, I would like to share what I have found to be a MUCH faster way. (Beware, it is more complicated.)
First make another table called "Individualizer". This will have one field containing a list of numbers 1 through the-highest-rank-that-you-need.
Next create a VBA module and paste this into it:
'Global Declarations Section.
Option Explicit
Global Cntr
'*************************************************************
' Function: Qcntr()
'
' Purpose: This function will increment and return a dynamic
' counter. This function should be called from a query.
'*************************************************************
Function QCntr(x) As Long
Cntr = Cntr + 1
QCntr = Cntr
End Function
'**************************************************************
' Function: SetToZero()
'
' Purpose: This function will reset the global Cntr to 0. This
' function should be called each time before running a query
' containing the Qcntr() function.
'**************************************************************
Function SetToZero()
Cntr = 0
End Function
Save it as Module1.
Next, create Query1 like this:
SELECT Table1.Dept, Count(Table1.Salesperson) AS CountOfSalesperson
FROM Table1
GROUP BY Table1.Dept;
Create a MakeTable query called Query2 like this:
SELECT SetToZero() AS Expr1, QCntr([ID]) AS Rank, Query1.Dept,
Query1.CountOfSalesperson, Individualizer.ID
INTO Qtable1
FROM Query1
INNER JOIN Individualizer
ON Query1.CountOfSalesperson >= Individualizer.ID;
Create another MakeTable query called Query3 like this:
SELECT SetToZero() AS Expr1, QCntr([Identifier]) AS Rank,
[Salesperson] & [Dept] & [#Customers] AS Identifier, Table1.Salesperson,
Table1.Dept, Table1.[#Customers]
INTO Qtable2
FROM Table1;
If you have another field already that uniquely identifies every row you wouldn't need to create an Identifier field.
Run Query2 and Query3 to create the tables.
Create a fourth query called Query4 like this:
SELECT Qtable2.Salesperson, Qtable2.Dept, Qtable2.[#Customers], Qtable1.ID AS Rank
FROM Qtable1
INNER JOIN Qtable2 ON Qtable1.Rank = Qtable2.Rank;
Query4 returns the result you are looking for.
Practically, you would want to write a VBA function to run Query2 and Query3 and then call that function from a button placed in a convenient location.
Now I know this sounds ridiculously complicated for the example you gave. But in real life, I am sure your table is more complicated than this. Hopefully my examples can be applied to your actual situation. In my database with over 12,000 records this method is by FAR the fastest (as in: 6 seconds with 12,000 records compared to over 1 minute with 262 records ranked with the subquery method).
The real secret for me was the MakeTable query because this ranking method is useless unless you immediately output the results to a table. But, this does limit the situations that it can be applied to.
P.S. I forgot to mention that in my database I was not pulling results directly from a table. The records had already gone through a string of queries and multiple calculations before they needed to be ranked. This probably contributed greatly to the huge difference in speed between the two methods in my situation. If you are pulling records directly from a table, you might not notice nearly as big an improvement.
You need to do some math. I typically take advantage of the combination of a counter field and an "offset" field. You're aiming for a table which looks like this (#Customers isn't necessary, but will give you a visual that you're doing it properly):
SalesPerson Dept #Customers Ctr Offset
Bill DeptA 20 1 1
Ted DeptA 30 2 1
Jane DeptA 40 3 1
Bill DeptB 50 4 4
Mary DeptB 60 5 4
So, to give rank, you'd do [Ctr]-[Offset]+1 AS Rank
build a table with SalesPerson, Dept, Ctr, and Offset
insert into that table, ordered by Dept and #Customers (so that they're all sorted properly)
Update Offset to be the MIN(Ctr), grouping on Dept
Perform your math calculation to determine Rank
Clear out the table so you're ready to use it again next time.
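The counter and offset steps above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, under the assumption that the rows fit in memory; rank_with_offset is a hypothetical helper name, and the data is the sample table from the question.

```python
# (SalesPerson, Dept, #Customers) rows from the question.
rows = [
    ("Jane", "DeptA", 40), ("Bill", "DeptB", 50), ("Bill", "DeptA", 20),
    ("Mary", "DeptB", 60), ("Ted", "DeptA", 30),
]

def rank_with_offset(source_rows):
    """Return (SalesPerson, Dept, #Customers, Ctr, Offset, Rank) tuples."""
    # Insert ordered by Dept and #Customers, numbering the rows as we go.
    ordered = sorted(source_rows, key=lambda r: (r[1], r[2]))
    numbered = [(sp, dept, cust, ctr)
                for ctr, (sp, dept, cust) in enumerate(ordered, start=1)]
    # Offset is MIN(Ctr) per Dept; the first counter seen for a dept is its minimum.
    offsets = {}
    for _sp, dept, _cust, ctr in numbered:
        offsets.setdefault(dept, ctr)
    # Rank = Ctr - Offset + 1.
    return [(sp, dept, cust, ctr, offsets[dept], ctr - offsets[dept] + 1)
            for sp, dept, cust, ctr in numbered]

ranked_rows = rank_with_offset(rows)
for row in ranked_rows:
    print(row)
```

Unlike the subquery approach, this numbering gives tied rows different ranks, since every row gets its own counter value.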
To add to this and to other Access ranking or rank tie-breaker how-tos, for any version of Access: the ranking subquery should not be run against a crosstab query. That is, if your FROM clause references a query rather than a table, that query must not be a crosstab query, nor contain a crosstab query anywhere within it.
The code referenced above, where a SELECT statement is nested within a SELECT statement (a subquery),
"SELECT *, (select count(*) from tbl as tbl2 where tbl.customers > tbl2.customers and tbl.dept = tbl2.dept) + 1 as rank from tbl"
will not work in that case; it fails with an error saying that the "tbl.customers > tbl2.customers" portion of the code cannot be found.
In my situation on a past project, I was referencing a query instead of a table, and that query in turn referenced a crosstab query, so it failed with an error. I was able to resolve this by first creating a table from the crosstab query; when I referenced the newly created table in the FROM clause, it started working for me.
So, in summary, you can normally reference either a query or a table in the FROM clause of the SELECT statement shared above to do ranking, but be careful: if you reference a query instead of a table, that query must not be a crosstab query or reference another query that is a crosstab query.
Hope this helps anyone who hits this error when using the statements above without a table in the FROM clause. Also, running subqueries on aliases against crosstab queries in Access probably isn't a good idea or best practice either, so steer away from that if/when possible.
If you found this useful, and wish that Access would allow the use of a scrolling mouse in a passthru query editor, give me a like please.
I normally pick tips and ideas from here and sometimes end up building amazing things from it!
Today (well, let’s say for the past week), I have been tinkering with ranking data in Access, and I did not anticipate that it would be so complex as to take me a week to figure out! I picked up titbits from two main sources:
https://usefulgyaan.wordpress.com/2013/04/23/ranking-in-ms-access/ (see that clever ‘>=’ part, and the self joins? Amazing. It helped me build my solution from just one query, as opposed to the complex method suggested above by asonoftheMighty; not discrediting you, I just didn’t want to try it for now. Maybe when I get to large data I will want to try that as well.)
Right here, from Paul Abott above (‘and tbl.dept = tbl2.dept’). I was lost after ranking because I was placing AND YearID = 1, etc. in the query, so the ranking would end up happening only for subsets, you guessed right, when YearID = 1! But I had a lot of different scenarios.
Well, I told that story partly to thank the contributors mentioned. What I ended up with is, to me, one of the most complex rankings, and I think it can help you in almost any situation; since I benefited from others, I would like to share here what I hope may benefit others as well.
Forgive me that I am not able to post my table structures here, it is a lot of related tables. I will only post the query, so if you need to you may develop your tables to end up with that kind of query. But here is my scenario:
You have students in a school. They go through class 1 to 4, can either be in stream A or B, or none when the class is too small. They each take 4 exams (this part is not important now), so you get the total score for my case. That’s it. Huh??
OK. Let's rank them this way:
We want to know the ranking of
• all students who ever passed through this school (best ever student)
• all students in a particular academic year (student of the year)
• students of a particular class (but remember a student will have passed through all classes, so basically his/her rank in each of those classes for the different years) this is the usual ranking that appears in report cards
• students in their streams (above comment applies)
• I would also like to know the population against which we ranked this student in each category
… all in one table/query. Now you get the point?
(I normally like to do as much of my 'programming' as possible in the database/queries, to give me visuals and to reduce the amount of code I will later have to write. I actually won't use this query in my application :), but it lets me know where and how to send my parameters to the query it came from, and what results to expect in my rdlc.)
Don't you worry, here it is:
SELECT Sc.StudentID, Sc.StudentName, Sc.Mark,
(SELECT COUNT(Sch.Mark) FROM [StudentScoreRankTermQ] AS Sch WHERE (Sch.Mark >= Sc.Mark)) AS SchoolRank,
(SELECT Count(s.StudentID) FROM StudentScoreRankTermQ AS s) As SchoolTotal,
(SELECT COUNT(Yr.Mark) FROM [StudentScoreRankTermQ] AS Yr WHERE (Yr.Mark >= Sc.Mark) AND (Yr.YearID = Sc.YearID) ) AS YearRank,
(SELECT COUNT(StudentID) FROM StudentScoreRankTermQ AS Yt WHERE (Yt.YearID = Sc.YearID) ) AS YearTotal,
(SELECT COUNT(Cl.Mark) FROM [StudentScoreRankTermQ] AS Cl WHERE (Cl.Mark >= Sc.Mark) AND (Cl.YearID = Sc.YearID) AND (Cl.TermID = Sc.TermID) AND (Cl.ClassID=Sc.ClassID)) AS ClassRank,
(SELECT COUNT(StudentID) FROM StudentScoreRankTermQ AS C WHERE (C.YearID = Sc.YearID) AND (C.TermID = Sc.TermID) AND (C.ClassID = Sc.ClassID) ) AS ClassTotal,
(SELECT COUNT(Str.Mark) FROM [StudentScoreRankTermQ] AS Str WHERE (Str.Mark >= Sc.Mark) AND (Str.YearID = Sc.YearID) AND (Str.TermID = Sc.TermID) AND (Str.ClassID=Sc.ClassID) AND (Str.StreamID = Sc.StreamID) ) AS StreamRank,
(SELECT COUNT(StudentID) FROM StudentScoreRankTermQ AS St WHERE (St.YearID = Sc.YearID) AND (St.TermID = Sc.TermID) AND (St.ClassID = Sc.ClassID) AND (St.StreamID = Sc.StreamID) ) AS StreamTotal,
Sc.CalendarYear, Sc.Term, Sc.ClassNo, Sc.Stream, Sc.StreamID, Sc.YearID, Sc.TermID, Sc.ClassID
FROM StudentScoreRankTermQ AS Sc
ORDER BY Sc.Mark DESC;
You should get something like this:
+-----------+-------------+------+------------+-------------+----------+-----------+-----------+------------+------------+-------------+------+------+-------+--------+
| StudentID | StudentName | Mark | SchoolRank | SchoolTotal | YearRank | YearTotal | ClassRank | ClassTotal | StreamRank | StreamTotal | Year | Term | Class | Stream |
+-----------+-------------+------+------------+-------------+----------+-----------+-----------+------------+------------+-------------+------+------+-------+--------+
| 1 | Jane | 200 | 1 | 20 | 2 | 12 | 1 | 9 | 1 | 5 | 2017 | I | 2 | A |
| 2 | Tom | 199 | 2 | 20 | 1 | 12 | 3 | 9 | 1 | 4 | 2016 | I | 1 | B |
+-----------+-------------+------+------------+-------------+----------+-----------+-----------+------------+------------+-------------+------+------+-------+--------+
Use the separators | to reconstruct the result table
Just to give an idea about the tables: each student is related to a class. Each class relates to years. Each stream relates to a class. Each term relates to a year. Each exam relates to a term, a student, a class and a year; a student can be in class 1A in 2016 and move on to class 2B in 2017, etc.
Let me also add that this is a beta result. I have not tested it well enough, and I have not yet had an opportunity to create a lot of data to see how it performs. My first glance at it told me that it is good. So if you find problems or warnings you want to point my way, please do so in the comments so I may keep learning!
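A cut-down version of the query shows the pattern: each rank is a correlated COUNT with >= on the mark, each total is a COUNT over the same scope without the mark condition, and narrower scopes just add equality conditions. The sketch below runs against in-memory SQLite from Python as an assumption for illustration; the scores table and its four rows are made up and only cover the school and year scopes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (student TEXT, mark INTEGER, year_id INTEGER)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?, ?)",
    [("Jane", 200, 2017), ("Tom", 199, 2016), ("Ali", 150, 2017), ("Eve", 120, 2016)],
)

# Rank = number of rows in the scope with a mark >= this row's mark.
# Total = number of rows in the same scope.
scoped_ranks = conn.execute(
    "SELECT sc.student, sc.mark, "
    " (SELECT COUNT(*) FROM scores AS sch WHERE sch.mark >= sc.mark) AS school_rank, "
    " (SELECT COUNT(*) FROM scores) AS school_total, "
    " (SELECT COUNT(*) FROM scores AS yr "
    "   WHERE yr.mark >= sc.mark AND yr.year_id = sc.year_id) AS year_rank, "
    " (SELECT COUNT(*) FROM scores AS yt WHERE yt.year_id = sc.year_id) AS year_total "
    "FROM scores AS sc ORDER BY sc.mark DESC"
).fetchall()

for row in scoped_ranks:
    print(row)
```

Because every output row triggers several subqueries over the whole source, the cost grows quickly with the row count, which is worth measuring before relying on this for large data.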

Anonymise SQLite database?

Problem: I have a table with first name, surname, and gender columns. I need to partially anonymise the database, by replacing all the names in this table with arbitrary made-up names. I also have a spreadsheet with lots of gender-specific arbitrary names.
Given this, how do I iterate through the rows of this table, and replace each name in turn with a name from the spreadsheet?
I can do this in C fairly trivially, but it's a day's work: export the spreadsheet as CSV, and then iterate through the rows of the table, updating each name with the next one from the CSV file. However, I can't help feeling that there's a much simpler way to do this by turning the CSV name data into a script, but I've got no idea how to iterate through the table from a script. Any pointers/ideas appreciated.
I believe you are on the right track with the application route, whether in C or Python or whatever you find convenient. Here is a different method that can be scripted.
Export data from Excel as CSV
$ cat test.csv
Jacob Jacobs,M
Rogers Bogers,M
Marsha Darsha,F
Tina Fina,F
Mono Bono,M
Import this into sqlite
sqlite> .mode csv
sqlite> .import test.csv proxy
sqlite> select * from proxy;
"Jacob Jacobs",M
"Rogers Bogers",M
"Marsha Darsha",F
"Tina Fina",F
"Mono Bono",M
Remember count of males and females
Let's say your table was called main in which you have real names, and you want to change them to names from proxy table randomly.
sqlite> .schema
CREATE TABLE proxy (fullname text, gender text);
CREATE TABLE main(fullname TEXT,gender TEXT,age INT);
sqlite> select * from main;
fullname,gender,age
"John Smith",M,20
"Marshall Dubin",M,20
"Kate Ortiz",F,20
"Ron Bunsh",M,20
"Kelly Torro",F,20
sqlite> select count(*) from main where gender='M';
count(*)
3
sqlite> select count(*) from main where gender='F';
count(*)
2
Have your application remember this information that there are 3 Males and 2 Females.
Execute update statement repeatedly with different offset
sqlite> update main
...> set fullname = (
...> select fullname from proxy where gender='M' order by random() limit 1)
...> where rowid = (
...> select rowid from main where gender='M' order by rowid limit 0,1);
Change the limit 0,1 to limit 1,1 and re-execute. Go on till you reach limit 2,1. Since you have 3 records for Males, go from limit 0,1 to limit 2,1.
Repeat the same thing to anonymize Female records. Change gender='M' to gender='F'. Since there are only 2 females, you will execute update two times. Once with limit 0,1 and then with limit 1,1.
If you run this in a transaction, I would expect your script to churn through the updates quite fast.
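The repeated update can be driven by a loop instead of editing the limit by hand. Here is a minimal sketch using Python's sqlite3 module; the choice of Python is an assumption (the answer suggests Bash), the function name anonymise is hypothetical, and the tables are recreated in memory with the sample data from above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE proxy (fullname TEXT, gender TEXT);
CREATE TABLE main (fullname TEXT, gender TEXT, age INT);
INSERT INTO proxy VALUES ('Jacob Jacobs','M'), ('Rogers Bogers','M'),
    ('Marsha Darsha','F'), ('Tina Fina','F'), ('Mono Bono','M');
INSERT INTO main VALUES ('John Smith','M',20), ('Marshall Dubin','M',20),
    ('Kate Ortiz','F',20), ('Ron Bunsh','M',20), ('Kelly Torro','F',20);
""")

def anonymise(connection):
    """Run the UPDATE once per row of each gender, inside one transaction."""
    with connection:  # commits on success, rolls back on error
        for gender in ("M", "F"):
            (count,) = connection.execute(
                "SELECT COUNT(*) FROM main WHERE gender = ?", (gender,)
            ).fetchone()
            for offset in range(count):  # same as limit 0,1 then 1,1 and so on
                connection.execute(
                    "UPDATE main SET fullname = ("
                    " SELECT fullname FROM proxy WHERE gender = ?"
                    " ORDER BY random() LIMIT 1) "
                    "WHERE rowid = ("
                    " SELECT rowid FROM main WHERE gender = ?"
                    " ORDER BY rowid LIMIT 1 OFFSET ?)",
                    (gender, gender, offset),
                )

anonymise(conn)
result = conn.execute("SELECT fullname, gender FROM main ORDER BY rowid").fetchall()
for row in result:
    print(row)
```

Each pick is random and independent, so the same fake name can be assigned to more than one row.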
End Result
WAS
fullname gender age
---------- ---------- ----------
John Smith M 20
Marshall D M 20
Kate Ortiz F 20
Ron Bunsh M 20
Kelly Torr F 20
IS
fullname gender age
------------- ---------- ----------
Rogers Bogers M 20
Jacob Jacobs M 20
Tina Fina F 20
Jacob Jacobs M 20
Jasmine F 20
Example of scripting SQLite with Bash - http://andreaolivato.tumblr.com/post/133473114/using-sqlite3-in-bash
Other option
In your application, hold the fake names in two arrays - one for male and one for female. The idea is to be able to pull a random fake name by gender on demand
Do a select rowid, gender from main order by rowid
Iterate through the records
If gender is male, pull a random fake record from male array; likewise for female record
Run update main set fullname=<fake-record> where rowid=<selected-row-id>
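The steps of this other option can be sketched as follows. This is a minimal Python version under stated assumptions: the fake names are held in a dictionary of lists keyed by gender (standing in for the two arrays), the function name anonymise_in_app is hypothetical, and the main table is recreated in memory with the sample rows.

```python
import random
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE main (fullname TEXT, gender TEXT, age INT)")
conn.executemany(
    "INSERT INTO main VALUES (?, ?, ?)",
    [("John Smith", "M", 20), ("Marshall Dubin", "M", 20), ("Kate Ortiz", "F", 20),
     ("Ron Bunsh", "M", 20), ("Kelly Torro", "F", 20)],
)

# Fake names per gender, as they would be loaded from the exported CSV.
fake_names = {
    "M": ["Jacob Jacobs", "Rogers Bogers", "Mono Bono"],
    "F": ["Marsha Darsha", "Tina Fina"],
}

def anonymise_in_app(connection, names_by_gender):
    """Replace every fullname with a random fake name of the same gender."""
    rows = connection.execute(
        "SELECT rowid, gender FROM main ORDER BY rowid"
    ).fetchall()
    with connection:  # one transaction for all updates
        for rowid, gender in rows:
            connection.execute(
                "UPDATE main SET fullname = ? WHERE rowid = ?",
                (random.choice(names_by_gender[gender]), rowid),
            )

anonymise_in_app(conn, fake_names)
anonymised = conn.execute("SELECT fullname, gender FROM main ORDER BY rowid").fetchall()
for row in anonymised:
    print(row)
```

If every person should get a distinct fake name, the lists could be shuffled and consumed one name at a time instead, provided there are at least as many fake names as rows for each gender.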