I changed the context a bit, but it's basically the same issue.
Imagine we are in a never-ending tunnel, shaped like a circle. We split the circle into sections numbered 1 to 10, and we'll call each section a slot (sl). There are 2 groups (gr) of living things walking in the tunnel. Each group has 2 bands, and each band has a name and global hitpoints (hp). Every group walks forward (although the bands within a group might change order). If a group is at slot #10 and moves forward, it will be at slot #1. We snapshot their information every day. All the data gathered is stored in a table with this structure:
+--------+--------------+----------------+--------------+--------------+----------------+--------------+--------------+----------------+--------------+--------------+----------------+--------------+
| day_id | gr_1_sl_1_id | gr_1_sl_1_name | gr_1_sl_1_hp | gr_1_sl_2_id | gr_1_sl_2_name | gr_1_sl_2_hp | gr_2_sl_1_id | gr_2_sl_1_name | gr_2_sl_1_hp | gr_2_sl_2_id | gr_2_sl_2_name | gr_2_sl_2_hp |
+--------+--------------+----------------+--------------+--------------+----------------+--------------+--------------+----------------+--------------+--------------+----------------+--------------+
|      1 |            3 | orc            |          100 |            4 | goblin         |           10 |           10 | human          |           50 |            1 | dwarf          |           25 |
|      2 |            6 | goblin         |            7 |            7 | orc            |           76 |            2 | human          |           60 |            3 | dwarf          |           28 |
+--------+--------------+----------------+--------------+--------------+----------------+--------------+--------------+----------------+--------------+--------------+----------------+--------------+
As you can see, the columns are laid out by sequential position within each group, while the data holds the actual slot values. What I want is to have the information shaped this way instead:
+---------+-------+-------+-----------+---------+
| id_game | gr_id | sl_id | band_name | band_hp |
+---------+-------+-------+-----------+---------+
| 1 | 1 | 3 | orc | 100 |
| 1 | 1 | 4 | goblin | 10 |
| 1 | 2 | 10 | human | 50 |
| 1 | 2 | 1 | dwarf | 25 |
| 2 | 1 | 6 | goblin | 7 |
| 2 | 1 | 7 | orc | 76 |
| 2 | 2 | 2 | human | 60 |
| 2 | 2 | 3 | dwarf | 28 |
+---------+-------+-------+-----------+---------+
I have this information in Power BI, although I can create views in SQL Server if need be. I have tried many things; the closest I got was unpivoting and parsing the original columns to get day_id, gr_id, sl_id, attributes and values. In attributes and values, it's basically name and hp with their corresponding value (I changed hp into a string), but then I'm stuck and not sure what to do next.
Does anyone have any ideas? Keep in mind that I oversimplified the problem; there are more groups, more slots, more bands and more statistics (e.g. attack and defense rating).
You seem to want to unpivot the table. In SQL Server, I recommend using apply:
select t.day_id as id_game, v.gr_id, v.sl_id, v.band_name, v.band_hp
from t cross apply
     (values (1, gr_1_sl_1_id, gr_1_sl_1_name, gr_1_sl_1_hp),
             (1, gr_1_sl_2_id, gr_1_sl_2_name, gr_1_sl_2_hp),
             (2, gr_2_sl_1_id, gr_2_sl_1_name, gr_2_sl_1_hp),
             (2, gr_2_sl_2_id, gr_2_sl_2_name, gr_2_sl_2_hp)
     ) v(gr_id, sl_id, band_name, band_hp);
In other databases, you can do something similar with union all.
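For reference, a sketch of the union all version, using the same assumed table name t and columns as above; this should run on most databases:

select day_id as id_game, 1 as gr_id, gr_1_sl_1_id as sl_id,
       gr_1_sl_1_name as band_name, gr_1_sl_1_hp as band_hp
from t
union all
select day_id, 1, gr_1_sl_2_id, gr_1_sl_2_name, gr_1_sl_2_hp from t
union all
select day_id, 2, gr_2_sl_1_id, gr_2_sl_1_name, gr_2_sl_1_hp from t
union all
select day_id, 2, gr_2_sl_2_id, gr_2_sl_2_name, gr_2_sl_2_hp from t;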
I'm designing a database for a workout tracker app. Each user should be able to track multiple workouts (routines). A workout can have multiple exercises, and an exercise can be used in many workouts. Each exercise will have a specific track type (weight and reps, distance and time, reps only).
My tables so far:
User
| id | name  |
|----|-------|
| 1  | Ilka  |
| 2  | James |
Exercise
| id | name                | track_type_id |
|----|---------------------|---------------|
| 1  | Barbell Bench Press | 1             |
| 2  | Squats              | 1             |
| 3  | Deadlifts           | 1             |
| 4  | Rowing Machine      | 3             |
Workout
| id | user_id | name            |
|----|---------|-----------------|
| 1  | 1       | Chest & Triceps |
| 2  | 1       | Legs            |
Workout_Exercise (junction table)
| id | exercise_id | workout_id |
|----|-------------|------------|
| 1  | 1           | 1          |
| 2  | 2           | 1          |
| 3  | 4           | 1          |
Workout_Sets
| id | workout_exercise_id | reps | weight |
|----|---------------------|------|--------|
| 1  | 1                   | 12   | 120    |
| 2  | 1                   | 10   | 120    |
| 3  | 1                   | 8    | 120    |
| 4  | 2                   | 10   | 220    |
| 5  | 3                   | null | null   |
TrackType
| id | name            |
|----|-----------------|
| 1  | Weight and Reps |
| 2  | Reps Only       |
| 3  | Distance Time   |
My issue is how to incorporate the TrackType table for each workout set. My first option was to create columns in the Workout_Sets table for each tracking type (weight and reps, distance and time, reps only), but that means many rows will have many nulls. Another option I considered was an EAV-type table, but I'm not sure. Also, do you think my design is efficient, or is it over-normalized?
I would say that the most efficient way is to allow nulls in your table. The alternative would require you to split many of the categories into separate tables. I would also recommend that you start factoring a user ID into the relevant tables of your database.
Your description states that “Each exercise will have a specific track type”, suggesting that each exercise has exactly one track type and that the relationship is unchanging. As such, the Exercise table should carry the track type column (which your track_type_id already does).
I suspect, however, that your problem description may be lacking specificity, making it difficult to give you sound advice. For instance, if the TrackType can vary for any given exercise, your TrackType column may belong on the Workout_Sets table. If the relationship between TrackType and Exercise/Workout_Sets is many-to-many, then you will need another junction table.
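For illustration only, a rough DDL sketch of the simplest reading of this (track type fixed on Exercise, nullable measurement columns on Workout_Sets). The distance/duration column names and types are assumptions, and TrackType and Workout_Exercise are assumed to be created first:

create table Exercise (
    id            int primary key,
    name          varchar(100) not null,
    track_type_id int not null references TrackType(id)  -- one fixed track type per exercise
);

create table Workout_Sets (
    id                  int primary key,
    workout_exercise_id int not null references Workout_Exercise(id),
    reps                int null,           -- Weight and Reps / Reps Only
    weight              decimal(6,2) null,  -- Weight and Reps
    distance            decimal(8,2) null,  -- Distance and Time (assumed name)
    duration_seconds    int null            -- Distance and Time (assumed name)
);

The columns that do not apply to a set's track type simply stay null, which is the trade-off discussed in the question.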
Your question regarding “over-normalization” depends upon many factors that are specific to your solution. In general, I would say no - the degree of normalization appears to be appropriate.
I am using Teradata SQL Assistant Version TD 16.10.06.01 ...
I have seen a lot of people transpose data for smallish tables, but I am working with thousands of clients and need to break the columns up into line item values to compare orders and highlight differences between them. The problem is that it is all horizontally linked, and I need to transpose it to Id, Transaction id, Version and Line Item Value 1, Line Item Value 2... then add another column comparing values to see if they changed.
example:
+----+------------+-----------+------------+----------------+--------+----------+----------+------+-------------+
| Id | First Name | Last Name | DOB | transaction id | Make | Location | Postcode | Year | Price |
+----+------------+-----------+------------+----------------+--------+----------+----------+------+-------------+
| 1 | John | Smith | 15/11/2001 | 1654654 | Audi | NSW | 2222 | 2019 | $ 10,000.00 |
| 2 | Mark | White | 11/02/2002 | 1661200 | BMW | WA | 8888 | 2016 | $ 8,999.00 |
| 3 | Bob | Grey | 10/05/2002 | 1667746 | Ford | QLD | 9999 | 2013 | $ 3,000.00 |
| 4 | Phil | Faux | 6/08/2002 | 1674292 | Holden | SA | 1111 | 2000 | $ 5,800.00 |
+----+------------+-----------+------------+----------------+--------+----------+----------+------+-------------+
hoping to change the data to :
+----+----------+----------+----------+----------------+----------+----------+----------------+---------+-----+
| id | trans_id | Vers_ord | Item Val | Ln_Itm_Dscrptn | Org_Val | Updt_Val | Amndd_Ord_chck | Lbl_Rnk | ... |
+----+----------+----------+----------+----------------+----------+----------+----------------+---------+-----+
| 1 | 1654654 | 2 | 11169 | Make | Audi BLK | Audi WHT | Yes | 1 | |
| 1 | 1654654 | 2 | 11189 | Location | NSW | WA | Yes | 2 | |
| 1 | 1654654 | 2 | 23689 | Postcode | 2222 | 6000 | Yes | 3 | |
+----+----------+----------+----------+----------------+----------+----------+----------------+---------+-----+
Recently, with smaller data, I created a table of the values, then used a case statement (when value 1 then xyz) with a product join ... and the data warehouse admins didn't mention anything out of order. But I only had a 16-row by 200-column table to transpose (Sum, Avg, Count, Median x 4 subsets of clients), which was significantly smaller than the current tables I need to compare.
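Roughly, that earlier approach looked like the sketch below (illustrative names only: line_items stands for the hand-built table of item values/descriptions, orders for the wide source table):

-- product join the wide table against the small list of line items,
-- then use CASE to pick the matching column: one output row per order per item
select o.Id,
       o.transaction_id,
       li.item_val,
       li.ln_itm_dscrptn,
       case li.ln_itm_dscrptn
           when 'Make'     then o.Make
           when 'Location' then o.Location
           when 'Postcode' then cast(o.Postcode as varchar(10))
       end as org_val
from orders o
cross join line_items li;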
I am worried my prior method will slow the data warehouse down, plus take a significant amount of time to type out in SQL.
Is there a better way to transpose large tables?
I have a master table (Project List) along with several sub tables that are joined on one common field (RecNum). I need to get totals for all of the sub tables, by column, and am not sure how to do it. This is a sample of the table design. There are more columns in each table (I need to pull * from "Project List"), but I'm showing a sampling of the column names and values to give an idea of what to do.
Project List
| RecNum | Project Description |
| 6 | Sample description |
| 7 | Another sample |
WeekA
| RecNum | UserName | Day1Reg | Day1OT | Day2Reg | Day2OT | Day3Reg | Day3OT |
| 6 | JustMe | 1 | 2 | 3 | 4 | 5 | 6 |
| 6 | NotMe | 1 | 2 | 3 | 4 | 5 | 6 |
| 7 | JustMe | | | | | | |
| 7 | NotMe | | | | | | |
WeekB
| RecNum | UserName | Day1Reg | Day1OT | Day2Reg | Day2OT | Day3Reg | Day3OT |
| 6 | JustMe | 7 | 8 | 1 | 2 | 3 | 4 |
| 6 | NotMe | 7 | 8 | 1 | 2 | 3 | 4 |
| 7 | JustMe | | | | | | |
| 7 | NotMe | | | | | | |
So the first query should return the complete totals for both users, like this:
| RecNum | Project Description | sumReg | sumOT |
| 6 | Sample description | 40 | 52 |
| 7 | Another sample | 0 | 0 |
The second query should return the totals for just a specified user, (WHERE UserName = 'JustMe') like this:
| RecNum | Project Description | sumReg | sumOT |
| 6 | Sample description | 20 | 26 |
| 7 | Another sample | 0 | 0 |
Multiple parallel tables with the same structure are usually a sign of poor database design. The data should really all be in one table, with an additional column specifying the week.
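For example, the combined table might look roughly like this (WeekHours and WeekLabel are made-up names, not something you already have):

-- one table for all weeks; WeekLabel says which week a row belongs to
create table WeekHours (
    RecNum    int not null,
    UserName  varchar(50) not null,
    WeekLabel varchar(10) not null,  -- e.g. 'A', 'B'
    Day1Reg   int, Day1OT int,
    Day2Reg   int, Day2OT int,
    Day3Reg   int, Day3OT int
);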
You can, however, use union all to bring the data together. The following is an example of a query:
select pl.recNum, pl.ProjectDescription,
       -- coalesce() so that blank day entries count as 0 instead of making the sum null
       sum(coalesce(Day1Reg, 0) + coalesce(Day2Reg, 0) + coalesce(Day3Reg, 0)) as sumReg,
       sum(coalesce(Day1OT, 0) + coalesce(Day2OT, 0) + coalesce(Day3OT, 0)) as sumOT
from ProjectList pl join
     (select * from WeekA union all
      select * from WeekB
     ) w
     on pl.recNum = w.recNum
group by pl.recNum, pl.ProjectDescription;
In practice, you should not use select * with union all; you should list the columns out explicitly. You can add appropriate where clauses or conditional aggregation to get the results you want in any particular case.
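For example, the per-user totals just add a filter on the unioned rows (select * kept here for brevity; the advice about listing columns still applies):

select pl.recNum, pl.ProjectDescription,
       sum(coalesce(Day1Reg, 0) + coalesce(Day2Reg, 0) + coalesce(Day3Reg, 0)) as sumReg,
       sum(coalesce(Day1OT, 0) + coalesce(Day2OT, 0) + coalesce(Day3OT, 0)) as sumOT
from ProjectList pl join
     (select * from WeekA union all
      select * from WeekB
     ) w
     on pl.recNum = w.recNum
where w.UserName = 'JustMe'
group by pl.recNum, pl.ProjectDescription;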
Please forgive my woefully limited understanding of SQL, but I'm hoping someone can help me. I need to alter a query written by someone else some time ago.
The query displays consumption per industry for a variety of industries in a number of areas. The table it spits out currently looks something like this:
+---------------+----------+---------+
| Economic area | Industry | Total |
+---------------+----------+---------+
| Area1 | | |
| | Ind1 | 459740 |
| | Ind2 | 43000 |
| | Ind3 | 0 |
| | Total | 502740 |
| Area2 | | |
| | Ind1 | 725560 |
| | Ind2 | 111017 |
| | Ind3 | 277577 |
| | Total | 1114154 |
+---------------+----------+---------+
Unfortunately, this table, in conjunction with another table we publish on the number of producers in each industry and area, can reveal commercially sensitive information when there are very few producers. For instance, in the table below, there's only one producer in Industry 2 in Area 1, so everything consumed by Industry 2 in Area 1 in the table above goes to that producer.
+---------------+---------+------+------+------+
| Economic area | County | Ind1 | Ind2 | Ind3 |
+---------------+---------+------+------+------+
| Area1 | | | | |
| | county1 | 1 | 0 | 0 |
| | county2 | 3 | 1 | 2 |
| | county3 | 1 | 0 | 0 |
| | Total: | 5 | 1 | 2 |
| | | | | |
| Area2 | county4 | 5 | 0 | 1 |
| | county5 | 3 | 3 | 1 |
| | county6 | 1 | 0 | 1 |
| | county7 | 0 | 0 | 0 |
| | Total: | 9 | 3 | 3 |
+---------------+---------+------+------+------+
What I've been asked to do is produce a condensed version of the first table, where industries that have fewer than 3 producers in an area are aggregated into a generic Other Industry. Something like this:
+---------------+----------+---------+
| Economic area | Industry | All     |
+---------------+----------+---------+
| Area1         |          |         |
|               | Ind1     | 459740  |
|               | OtherInd | 43000   |
|               | Total    | 502740  |
| Area2         |          |         |
|               | Ind1     | 725560  |
|               | Ind2     | 111017  |
|               | Ind3     | 277577  |
|               | Total    | 1114154 |
+---------------+----------+---------+
I have been searching for a while, but haven't been able to find anything that works, or that I can understand well enough to make it work. I tried using a Count(Case(industry_code<3,1,0))... but I'm working in MS Access, so that doesn't work. I thought about using an IIF or a Switch statement, but it doesn't seem like either of those allows for the right type of comparison. I also found a suggestion for a FROM statement with two different groupings, but Access spat out an error when I tried it.
The only marginal success I've had is with a HAVING (((Count(Allmills.industry_code))>3)), but it just drops the problem industries completely.
Currently, a somewhat simplified version of the query looks like this:
SELECT
economic_areas.economic_area AS [Economic area],
Industry_codes.industry_heading AS Industry,
Sum(Allmills.consumption) AS [All],
Sum(Allmills.[WA origin logs]) AS Washington,
Allmills.industry_code,
Count(Allmills.industry_code) AS CountOfindustry_code,
Sum(Allmills.industry_code) AS SumOfindustry_code
FROM ((economic_areas INNER JOIN Allmills ON (economic_areas.state_abbrev =
Allmills.state_abbrev)
AND (economic_areas.economic_area_code = Allmills.economic_area_code))
INNER JOIN Industry_codes ON Allmills.display_industry_code =
Industry_codes.industry_code)
WHERE (((Allmills.economic_area_code) Is Not Null))
GROUP BY Allmills.display_industry_economic_area_code,
Allmills.display_industry_code, economic_areas.economic_area,
Industry_codes.industry_heading, Allmills.industry_code
ORDER BY Allmills.display_industry_economic_area_code,
Allmills.display_industry_code;
Any help would be greatly appreciated, even just suggestions of what types of techniques might be useful that I can look into elsewhere - I'm just running in circles right now.
HAVING really is the solution here: change your query to use HAVING Count(Allmills.industry_code) >= 3 for the industries you report individually, add another query with HAVING Count(Allmills.industry_code) < 3 that rolls those industries up as the generic OtherInd, then UNION ALL the two results.
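A rough sketch of that approach against a hypothetical simplified source (one row per mill with economic_area, industry_heading and consumption; your real query keeps its joins and extra GROUP BY columns):

SELECT economic_area, industry_heading AS Industry, Sum(consumption) AS [All]
FROM Allmills_simplified
GROUP BY economic_area, industry_heading
HAVING Count(*) >= 3

UNION ALL

SELECT economic_area, 'OtherInd' AS Industry, Sum(consumption_sum) AS [All]
FROM (
    SELECT economic_area, Sum(consumption) AS consumption_sum
    FROM Allmills_simplified
    GROUP BY economic_area, industry_heading
    HAVING Count(*) < 3
) AS small
GROUP BY economic_area;

If your version of Access complains about the nested SELECT, save the inner query as its own saved query and reference it by name instead.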
I'm trying to set up a report based on several tables.
I have a table Actual that looks like this:
+--------+------+
| status | date |
+--------+------+
| 5 | 7/10 |
| 8 | 7/9 |
| 8 | 7/11 |
| 5 | 7/18 |
+--------+------+
Table Targets looks like this:
+--------+-------------+--------+------------+
| status | weekEndDate | target | cumulative |
+--------+-------------+--------+------------+
| 5 | 7/12 | 4 | 45 |
| 5 | 7/19 | 5 | 50 |
| 8 | 7/12 | 4 | 45 |
| 8 | 7/19 | 5 | 50 |
+--------+-------------+--------+------------+
Grouping the Actual records by which Targets.weekEndDate they fall under, I have the following aggregate query GroupActual:
+-------------+------------+--------------+--------+------------+
| weekEndDate | status | weeklyTarget | actual | cumulative |
+-------------+------------+--------------+--------+------------+
| 7/12 | 5 | 4 | 1 | 45 |
| 7/12 | 8 | 4 | 2 | 41 |
| 7/19 | 5 | 5 | 1 | 50 |
| 7/19 | 8 | 4 | | 45 |
+-------------+------------+--------------+--------+------------+
I'm trying to create this report:
+--------+------------+------+------+
| status | category | 7/12 | 7/19 | ...etc for every weekEndDate entry in Targets
+--------+------------+------+------+
| 5 | actual | 1 | 1 |
| 5 | target | 4 | 5 |
| 5 | cumulative | 45 | 50 |
+--------+------------+------+------+
| 8 | actual | 2 | |
| 8 | target | 4 | 5 |
| 8 | cumulative | 45 | 50 |
+--------+------------+------+------+
I can use a crosstab query to make the date columns, but I'm not sure how to have rows for "actual", "target", and "cumulative". They aren't values in the same table, which means (I think) that a crosstab query won't be useful for this breakdown. Should I try to change GroupActual so that it puts the data in the shape I'm looking for? Kind of confused as to where to go next with this...
EDIT: I've made some headway on the crosstabs as per PowerUser's solution, but I'm having trouble with the one for Target. I modified the wizard's generated SQL in an attempt to get what I want but it's not working out. I used a version of GroupActual that only has the weekEndDate,status, and weeklyTarget columns; here's the SQL:
TRANSFORM weeklyTarget
SELECT status
FROM TargetStatus_forCrosstab_Target
GROUP BY status,weeklyTarget
PIVOT Format([weekEndDate],"Short Date");
You're almost there. The problem is that you can't do this all in a single crosstab. You need to make 3 crosstabs (one for 'actual', one for 'target', and one for 'cumulative'), then make a Union query to combine them all.
Additional Tip: In your individual crosstabs, add a Sort column. Your 'actual' crosstab will have a Sort value of 1, 'Target' will have a Sort value of 2, and 'Cumulative' will have 3. That way, when you union them together, you can get them all in the right order.
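For the Target crosstab from the EDIT, the usual fix is to wrap the value in an aggregate (TRANSFORM requires one, even when there is only one value per cell) and drop weeklyTarget from the GROUP BY; a sketch using the query name from the question:

TRANSFORM Sum(weeklyTarget)
SELECT status
FROM TargetStatus_forCrosstab_Target
GROUP BY status
PIVOT Format([weekEndDate],"Short Date");

The category label and the Sort value can then be added in the union query, e.g. SELECT 'target' AS category, 2 AS SortOrder, T.* FROM qryTargetCrosstab AS T, where qryTargetCrosstab is whatever you name the saved crosstab.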