SSRS report table including Averages - sql

I need a report that has some data in it with calculation data among regular rows. For example:
Name | Age | Salary
HR | 35 | $1300
John | 30 | $1000
Mark | 40 | $1600
Law | 45 | $1500
Bill | 40 | $1000
Sara | 50 | $2000
The idea is to group rows by a field and then add a row with average numbers for this group.
Is it possible? I also have 2 date parameters (start and end), so I need to get all the records to SSRS and then filter them out...

Yes, this is possible and very straight forward.
Create your report with the data rows, then create a group on the Department field. You can do this a few ways: right click on the detail rows and select Add Group... or drag the department field to the Row groups pane in the design window.
Add a row to the group by right clicking on the details group and choosing to add a total, before the details. In the new row, set your formula to be =Avg(MyDataset!AgeFieldName.Value)
Take a look at the tutorials available on MSDN, especially the Grouping and Totals section

Related

Significance test for AB testing of email marketing campaign - data in MySQL

I have 50,000 email ids divided into Group A and Group B equally.
An email is sent to both groups with different subject.
The Open Rate is as follows:
Group | Emails_Opened | Total Emails | Open Rate
A | 24332 | 34471 | 70.5869%
B | 24020 | 33761 | 71.1427%
Which significance test should I run to make sure the results are statistically significant and how?
The data is in SQL database.
So you fill in the column totals and row totals, then each expected value is
<corresponding column total>*<corresponding row total>/<grand total>
e.g.
=B$4*$D2/$D$4
Then use
=CHISQ.TEST(B2:C3,F2:G3)

Structuring Month-Based Data in SQL

I'm curious about what the best way to structure data in a SQL database where I need to keep track of certain fields and how they differ month to month.
For example, if I had a users table in which I was trying to store 3 different values: name, email, and how many times they've logged in each month. Would it be best practice to create a new column for each month and store the number of times they logged in that month under that column? Or would it be better to create a new row/table for each month?
My instinct says creating new columns is the best way to reduce redundancy, however I can see it getting a little unwieldy when the number of columns in the table changes over time. (I was also thinking that if I were to do it by column, it would warrant having a total_column that keeps track of all months at a time).
Thanks!
In my opinion, the best approach is to store each login for each user.
Use a query to summarize the data the way you need it when you query it.
You should only be thinking about other structures if summarizing the detail doesn't meet performance requirements -- which for a monthly report don't seem so onerous.
Whatever you do, storing counts in separate columns is not the right thing to do. Every month, you would need to add another column to the table.
I'm not an expert but in my opinion, it is best to store data in a separate table (in your case). That way you can manipulate the data easily and you don't have to modify the table design in the future.
PK: UserID & Date or New Column (Ex: RowNo with auto increment)
+--------+------------+-----------+
| UserID | Date | NoOfTimes |
+--------+------------+-----------+
| 01 | 2018.01.01 | 1 |
| 01 | 2018.01.02 | 3 |
| 01 | 2018.01.03 | 5 |
| .. | | |
| 02 | 2018.01.01 | 2 |
| 02 | 2018.01.02 | 6 |
+--------+------------+-----------+
Or
PK: UserID, Year & Month or New Column (Ex: RowNo with auto increment)
+--------+------+-------+-----------+
| UserID | Year | Month | NoOfTimes |
+--------+------+-------+-----------+
| 01 | 2018 | Jan | 10 |
| 01 | 2018 | feb | 13 |
+--------+------+-------+-----------+
Before you create the table, please take a look at the database normalization. Especially 1st (1NF), 2nd (2NF) and 3rd (3NF) normalization forms.
https://www.tutorialspoint.com/dbms/database_normalization.htm
https://www.lifewire.com/database-normalization-basics-1019735
https://www.essentialsql.com/get-ready-to-learn-sql-database-normalization-explained-in-simple-english/
https://www.studytonight.com/dbms/database-normalization.php
https://medium.com/omarelgabrys-blog/database-normalization-part-7-ef7225150c7f
Either approach is valid, depending on query patterns and join requirements.
One row for each month
For a user, the row containing login count for the month will be inserted when data is available for the month. There will be 1 row per month per user. This design will make it easier to do joins by month column. However, multiple rows will need to be accessed to get data for a user for the year.
-- column list
name
email
month
login_count
-- example entries
'user1', 'user1#email.com','jan',100
'user2', 'user2#email.com','jan',65
'user1', 'user1#email.com','feb',90
'user2', 'user2#email.com','feb',75
One row for all months
You do not need to dynamically add columns, since number of months is known in advance. The table can be initially created to accommodate all months. By default, all month_login_count columns would be initialized to 0. Then, the row would be updated as the login count for the month is populated. There will be 1 row per user. This design is not the best for doing joins by month. However, only one row will need to be accessed to get data for a user for the year.
-- column list
name
email
jan_login_count
feb_login_count
mar_login_count
apr_login_count
may_login_count
jun_login_count
jul_login_count
aug_login_count
sep_login_count
oct_login_count
nov_login_count
dec_login_count
-- example entries
'user1','user1#email.com',100,90,0,0,0,0,0,0,0,0,0,0
'user2','user2#email.com',65,75,0,0,0,0,0,0,0,0,0,0

SQL payments matrix

I want to combine two tables into one:
The first table: Payments
id | 2010_01 | 2010_02 | 2010_03
1 | 3.000 | 500 | 0
2 | 1.000 | 800 | 0
3 | 200 | 2.000 | 300
4 | 700 | 1.000 | 100
The second table is ID and some date (different for every ID)
id | date |
1 | 2010-02-28 |
2 | 2010-03-01 |
3 | 2010-01-31 |
4 | 2011-02-11 |
What I'm trying to achieve is to create table which contains all payments before the date in ID table to create something like this:
id | date | T_00 | T_01 | T_02
1 | 2010-02-28 | 500 | 3.000 |
2 | 2010-03-01 | 0 | 800 | 1.000
3 | 2010-01-31 | 200 | |
4 | 2010-02-11 | 1.000 | 700 |
Where T_00 means payment in the same month as 'date' value, T_01 payment in previous month and so on.
Is there a way to do this?
EDIT:
I'm trying to achieve this in MS Access.
The problem is that I cannot connect name of the first table's column with the date in the second (the easiest way would be to treat it as variable)
I added T_00 to T_24 columns in the second (ID) table and was trying to UPDATE those fields
set T_00 =
iif(year(date)&"_"&month(date)=2010_10,
but I realized that that would be to much code for access to handle if I wanted to do this for every payment period and every T_xx column.
Even if I would write the code for T_00 I would have to repeat it for next 23 periods.
Your Payments table is de-normalized. Those date columns are repeating groups, meaning you've violated First Normal Form (1NF). It's especially difficult because your field names are actually data. As you've found, repeating groups are a complete pain in the ass when you want to relate the table to something else. This is why 1NF is so important, but knowing that doesn't solve your problem.
You can normalize your data by creating a view that UNIONs your Payments table.
Like so:
CREATE VIEW NormalizedPayments (id, Year, Month, Amount) AS
SELECT id,
2010 AS Year,
1 AS Month,
2010_01 AS Amount
FROM Payments
UNION ALL
SELECT id,
2010 AS Year,
2 AS Month,
2010_02 AS Amount
FROM Payments
UNION ALL
SELECT id,
2010 AS Year,
3 AS Month,
2010_03 AS Amount
FROM Payments
And so on if you have more. This is how the Payments table should have been designed in the first place.
It may be easier to use a date field with the value '2010-01-01' instead of a Year and Month field. It depends on your data. You may also want to add WHERE Amount IS NOT NULL to each query in the UNION, or you might want to use Nz(2010_01,0.000) AS Amount. Again, it depends on your data and other queries.
It's hard for me to understand how you're joining from here, particularly how the id fields relate because I don't see how they do with the small amount of data provided, so I'll provide some general ideas for what to do next.
Next you can join your second table with this normalized Payments table using a method similar to this or a method similar to this. To actually produce the result you want, include a calculated field in this view with the difference in months. Then, create an actual Pivot Table to format your results (like this or like this) which is the proper way to display data like your tables do.

vb.net and crystal report

I have a list of employee on a list view. In every employee there are list of task on each employee. I want to print in in crystal report per employee and their respective tasks. Once print button was click then I have a query to get all employees and loop to get their respective tasks. My problem now is the crystal report displaying. Please take a look sample result below I want to achieve. Thanks in advance.
Juan De La Cruz
Task | No of hours | Date
task 1 | 59 | 2014/01/05
task 2 | 60 | 2014/01/06
Justin Garcia
Task | No of hours | Date
task 1 | 35 | 2014/01/05
task 2 | 23 | 2014/01/06
In the report itself from (Field Explorer) select Group Name Fields, Select Group Expert and Select under Group By the groping criteria which in your case (Employee Id OR Employee Name) as your requirement so the report will show each employee information in separated section as you need in the question.

Best way to join the two tables *including* duplicates from one table

Accounts (table)
+----+----------+----------+-------+
| id | account# | supplier | RepID |
+----+----------+----------+-------+
| 1 | 123xyz | Boston | 2 |
| 2 | 245xyz | Chicago | 2 |
| 3 | 425xyz | Chicago | 3 |
+----+----------+----------+-------+
PayOut (table)
+----+----------+----------+-------------+--------+
| id | account# | supplier | datecreated | Amount |
+----+----------+----------+-------------+--------+
| 5 | 245xyz | Chicago | 01-15-2009 | 25 |
| 6 | 123xyz | Boston | 10-15-2011 | 50 |
| 7 | 123xyz | Boston | 10-15-2011 | -50 |
| 8 | 123xyz | Boston | 10-15-2011 | 50 |
| 9 | 425xyz | Chicago | 10-15-2011 | 100 |
+----+----------+----------+-------------+--------+
I have accounts table and I have payout table. Payout table comes from abroad so we do not have any control over it. This leaves us with a problem that we can't join the two tables based on record ID field, that is one problem which we can't solved. We therefore join based on Account#, SupplierID (2nd and 3rd column). This creates a problem that it creates (possibly) many to many relationship. But we filter our records if they are active and we use a second filter on payout table when the payout was created. Payout are created months to month. There are two problems with this in my view
The query takes quite a bit of time to complete (could be inefficient)
There are certain duplicates that are removed which should not be removed. Example is record 6 and 8 in payout table. What happened here is, we got a customer, then the customer cancelled then he got him back. In this case +50, -50 and +50. Again all values are valid and must show in the report for audit purposes. Currently only one +50 is shown, the other is lost. There are a couple of other problems within the report that comes once in a while.
Here is the query. It uses groups by to remove duplicates. I would like to have an advance query which outperforms and which does takes into account that no record in PayOut table is duplicated as long as they come up in the month of the report.
Here is our current query
/* Supplied to Store Procedure */
-----------------------------------
#RepID // the person for whome payout is calculated
#Month // of payment date
#year // year of payment date
-----------------------------------
select distinct
A.col1,
A.col2,
...
A.col10,
B.col2,
B.Col2,
B.Amount /* this is the important column, portion of which goes to Rep */
from records A
JOIN payout B
on A.Supplier = B.Supplier AND A.Account# = B.Account#
where datepart(mm, B.datecreated) = #Month /* parameter to stored procedure */
and datepart(yyyy, B.datecreated) = #Year
and A.[rep ID] = #RepID /* parameter to SP */
group by
col1,col2,col3,....col10
order by customerName
Is this query optimum? Can I improve it using CROSS APPLY or WHERE EXISTs that will make it faster as well as remove the duplicate problem?
Note that this query is used to get payout of a rep. Hence every record has repid field who it is assigned to. Ideally I would like to use Select WHERE Exist query.
It's difficult to understand exactly what you want because in one place you say you 'want' the duplicates but then you say that you are using the group by to remove duplicates. So the first thought would be "Why not just get rid of the group by?". But I have to believe you are smart enough to have thought of that yourself, so I assume it's got to be there for a reason.
I think someone here could help you pretty easily if you could post the actual query, but since you say you can't I will just try to give you some direction in solving the problem...
Instead of trying to do everything in one statement, use temporary tables or views to split it up. It may be easier for you to think about how to get rid of the duplicates you don't want and keep the ones you do first and put those into a temporary table, and then join the tables together and work with that.