Best way to query/visualize period data (SQL/BI)

Lately I have had a few cases where I had to build reports from one table like:
1|text|text
2|text|text
3|text|text
and another table
1|1.1.2017|text
1|1.2.2017|text
2|1.1.2017|text
2|1.2.2017|text
3|1.1.2017|and so on
result should be:
Jan|Feb|March|...
1|text|text|2 | 1 | ...
2|text|text|2 | 1 | ...
3|text|text|1 | 1 | ...
My first question is whether there is a common way to do this. I have already built queries that do it, but they may not be as efficient as they could be. This seems like a very common business case, so maybe there are (standardized) techniques for it that I don't know yet.
My second question: the queried data will go into a BI tool later. Is it better (faster) to run the queries first, load the resulting tables into the BI tool, and then manipulate the data there as desired? Maybe someone has experience with this and could give me advice.
Thanks

Have a look at the answer to "duplicate ID in sql select" (except for its first sentence).
In my experience it is best to do as you suggested: use SQL to group and aggregate the data as you want and then use a BI tool to present it.
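One common way to do the grouping step in SQL is conditional aggregation: join the two tables and count rows per month with SUM(CASE ...). Below is a minimal sketch using Python's built-in sqlite3 module so it can be run anywhere. The table and column names (items, events, col_a, event_date, ...) are made up for illustration, and the dates from the question are assumed to be day.month.year, stored here in ISO format.

```python
import sqlite3

# In-memory database with hypothetical table names standing in for the
# two tables from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items  (id INTEGER PRIMARY KEY, col_a TEXT, col_b TEXT);
    CREATE TABLE events (item_id INTEGER, event_date TEXT, note TEXT);
    INSERT INTO items VALUES
        (1, 'text', 'text'), (2, 'text', 'text'), (3, 'text', 'text');
    INSERT INTO events VALUES
        (1, '2017-01-01', 'text'), (1, '2017-02-01', 'text'),
        (2, '2017-01-01', 'text'), (2, '2017-02-01', 'text'),
        (3, '2017-01-01', 'text');
""")

# One SUM(CASE ...) per month turns detail rows into one column per period.
PIVOT_SQL = """
    SELECT i.id, i.col_a, i.col_b,
           SUM(CASE WHEN strftime('%m', e.event_date) = '01' THEN 1 ELSE 0 END) AS jan,
           SUM(CASE WHEN strftime('%m', e.event_date) = '02' THEN 1 ELSE 0 END) AS feb,
           SUM(CASE WHEN strftime('%m', e.event_date) = '03' THEN 1 ELSE 0 END) AS mar
    FROM items AS i
    LEFT JOIN events AS e ON e.item_id = i.id
    GROUP BY i.id, i.col_a, i.col_b
    ORDER BY i.id
"""

rows = conn.execute(PIVOT_SQL).fetchall()
for row in rows:
    print(row)
```

The LEFT JOIN keeps items that have no events at all (their counts come out as 0). The aggregated result is small, so handing it to the BI tool for presentation is cheap.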

Related

SSIS Fuzzy Grouping always returns the same result with different similarity thresholds

Can anyone tell me why my similarity is always 1?
My goal is for AAB and AAC, for example, to be placed in the same group.
Thanks

After trying different source data, I got the result I needed.
I think sample data should use realistic, real-world values.
Instead of AAA and AAC, use something like a Name column with Sara vs. Saraa; SSIS then says they are in the same group. However, I found that Don vs. Done are not grouped together. So it may not be a good idea to use this to filter records where the typo is a different letter?
Also, try using more than one column as your comparison columns.

In SQL, what is the memory-efficient way of "mapping" 1 ID to multiple IDs?

I'll describe my scenario so you guys understand what type of design pattern I'm looking for.
I'm making an application where I provide someone with a link that is associated with one or more files. For example, someone needs somePowerpoint.ppx, main.cpp and somevid.mp4, and I have a tool that makes kj13h1djdsja213j1hhadad9933932 associated with those 3 files so that I can give someone
mysite.com/getfiles?fid=kj13h1djdsja213j1hhadad9933932
and they'll get a list of those files that they can download individually or all at once.
Since I'm new to SQL, the only way I know of doing that is having my tool use a table like
fid | filename
------------------------------------------------------------------
kj13h1djdsja213j1hhadad9933932 somePowerpoint.ppx
kj13h1djdsja213j1hhadad9933932 main.cpp
kj13h1djdsja213j1hhadad9933932 somevid.mp4
jj133823u22h248884h4h24h01h232 someotherfile.someextension
to go along with the above example. It would be nice if I could do some equivalent of
fid | filename(s)
---------------------------------------------------------------------------
kj13h1djdsja213j1hhadad9933932 somePowerpoint.ppx, main.cpp, somevid.mp4
jj133823u22h248884h4h24h01h232 someotherfile.someextension
but I'm not sure if that's possible or if I should be using some other design pattern altogether.
Any advice?

I believe "Concatenate many rows into a single text string?" can help you build a query that generates your condensed format. You would still store the data in SQL as the full list (one row per file), but you could create a view that shows the condensed version using the query from that link.
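A minimal sketch of that idea, run through Python's sqlite3 module: the data stays as one row per (fid, filename), and a view builds the condensed list. The names link_files and link_files_condensed are made up for illustration. SQLite and MySQL call the aggregate group_concat / GROUP_CONCAT; SQL Server 2017 and later has STRING_AGG for the same purpose.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Normalized storage: one row per file, as in the question.
    CREATE TABLE link_files (
        fid      TEXT NOT NULL,
        filename TEXT NOT NULL,
        PRIMARY KEY (fid, filename)
    );
    INSERT INTO link_files VALUES
        ('kj13h1djdsja213j1hhadad9933932', 'somePowerpoint.ppx'),
        ('kj13h1djdsja213j1hhadad9933932', 'main.cpp'),
        ('kj13h1djdsja213j1hhadad9933932', 'somevid.mp4'),
        ('jj133823u22h248884h4h24h01h232', 'someotherfile.someextension');

    -- Condensed presentation: one row per fid with a comma-separated list.
    CREATE VIEW link_files_condensed AS
        SELECT fid, group_concat(filename, ', ') AS filenames
        FROM link_files
        GROUP BY fid;
""")

# Map each fid to its condensed filename string.
condensed = dict(conn.execute("SELECT fid, filenames FROM link_files_condensed"))
for fid, filenames in condensed.items():
    print(fid, filenames)
```

The order of names inside each concatenated string is not guaranteed, so the application should not rely on it. For serving the download page, a plain SELECT filename FROM link_files WHERE fid = ? on the normalized table is usually all that is needed.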

SQL - Calculating columns using dynamic functions

I'm trying to create a set of data that I'm going to write out to a file. It's essentially a report composed of various fields from a number of different tables. Some columns need some processing done on them; others can just be selected.
Different users will likely want different processing performed on certain columns, and in the future I'll probably need to add more functions for computed columns.
I'm looking for the cleanest, most flexible approach to storing and using all the different functions I'm likely to need for these computed columns. I've got two ideas in my head, but I'm hoping there might be a much more obvious solution I'm missing.
For a simple, slightly odd example, a Staff table:
Employee | DOB | VacationDays
Frank | 01/01/1970 | 25
Mike | 03/03/1975 | 24
Dave | 05/02/1980 | 30
I'm thinking I'd either end up with a query like
SELECT NameFunction(Employee, optionID),
DOBFunction(DOB, optionID),
VacationFunction(VacationDays, optionID)
FROM Staff
With user defined functions, where the optionID would be used in a case statement inside the functions to decide what processing to perform.
Or I'd want to make the way the data is returned customisable using a lookup table of other functions:
ID | Name | Description
1 | ShortName | Obtains 3 letter abbreviation of employee name
2 | LongDOB | Returns DOB in format ~ 1st January 1970
3 | TimeStampDOB | Returns Timestamp for DOB
4 | VacationSeconds | Returns Seconds of vacation time
5 | VacationBusinessHours | Returns number of business hours of vacation
Which seems neater, but I'm not sure how I'd formulate the query, presumably using dynamic SQL? Is there a sensible alternative?
The functions will be used on a few thousand rows.
The closest answer I've found was in this thread:
Call dynamic function name in SQL
I'm not a huge fan of dynamic SQL, although in this case I think it might be the best way to get the result I'm after?
Any replies appreciated,
Thanks,
Chris

I would go for the second solution. You could even use real stored proc names in your lookup table.
create proc ShortName (
@param varchar(50)
) as
begin
select 'ShortName: ' + @param
end
go
-- the procedure name is held in a variable and resolved at run time
declare @proc sysname = 'ShortName'
exec @proc 'David'
As you can see in the example above, the first argument of exec (i.e. the procedure name) can be a variable. That said, all the usual warnings regarding dynamic SQL apply.

In the end, you should go with whichever is faster, so you should try both ways (and any other way someone might come up with) and decide after that.
I prefer the first option, as long as your functions don't run extra selects against a table. You may not even need the user-defined functions if they are not going to be reused in a different report.
I prefer to use dynamic SQL only to improve a query's performance, such as adding dynamic ordering or adding/removing complex WHERE conditions.
But these are all subjective opinions, the best thing is try, compare, and decide.

Actually, this isn't a question of what's faster. It is a question of what makes the code cleaner, particularly for adding new functionality (new columns, new column formats, re-ordering them).
Don't think of your second approach as "using dynamic SQL", since that tends to have negative connotations. Instead, think of it as a data-driven approach. You want to build a table that describes the columns that users can get, and the formats. This is great! Users can then provide a list of columns, and you'll have a magical stored procedure that combines the information from the users with the information in your metadata table, and produces the desired result.
I'm a big fan of data-driven approaches, and dynamic SQL is the best SQL tool I've found so far for implementing them.
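To make the data-driven idea concrete, here is a minimal sketch using Python's sqlite3 module in place of a stored procedure. The ColumnFormat table, its Expression column, the build_report_query helper, and the two expressions are all assumptions made up for illustration (including "vacation seconds" as days * 24 * 60 * 60); only the Staff data and the format names come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Staff (Employee TEXT, DOB TEXT, VacationDays INTEGER);
    INSERT INTO Staff VALUES
        ('Frank', '01/01/1970', 25),
        ('Mike',  '03/03/1975', 24),
        ('Dave',  '05/02/1980', 30);

    -- Hypothetical metadata table: each format maps to a SQL expression.
    CREATE TABLE ColumnFormat (ID INTEGER PRIMARY KEY, Name TEXT, Expression TEXT);
    INSERT INTO ColumnFormat VALUES
        (1, 'ShortName',       'upper(substr(Employee, 1, 3))'),
        (4, 'VacationSeconds', 'VacationDays * 24 * 60 * 60');
""")

def build_report_query(conn, format_ids):
    """Build a SELECT from the formats a user asked for.

    Only expressions stored in ColumnFormat end up in the SQL text, so the
    metadata table acts as a whitelist; user input is never concatenated.
    """
    parts = []
    for format_id in format_ids:
        row = conn.execute(
            "SELECT Name, Expression FROM ColumnFormat WHERE ID = ?",
            (format_id,),
        ).fetchone()
        if row is None:
            raise ValueError(f"unknown column format: {format_id}")
        name, expression = row
        parts.append(f'{expression} AS "{name}"')
    return "SELECT " + ", ".join(parts) + " FROM Staff ORDER BY Employee"

query = build_report_query(conn, [1, 4])
report = conn.execute(query).fetchall()
print(query)
print(report)
```

Adding a new format is then an INSERT into the metadata table rather than a code change, which is the main attraction of this design.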

Adding columns dynamically to a View or return from Stored Procedure

I've found a lot of bits and pieces of this, but I can't put them together. This is basically the idea of the table, where Name is a varchar, Date is a datetime, and Number is an int:
Name | Date | Number
A 1-2-11 15
B 1-2-11 8
A 1-1-11 5
I'd like to create a view that looks like this
Name | 1-2-11 | 1-1-11
A 15 5
B 8
At first I was using a temp table and appending each date row to it. I read on another forum that this approach is a major resource hog. Is that true? Is there a better way to do this?

I would combine dynamic SQL with a pivot as I mentioned in this answer.
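A minimal sketch of that combination, written with Python's sqlite3 module rather than T-SQL: first read the distinct dates, then generate one MAX(CASE ...) column per date. SQLite has no PIVOT keyword, so conditional aggregation stands in for it here. The table name Readings and the helper build_pivot_query are made up for illustration, and the dates are kept as the plain strings from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Readings (Name TEXT, Date TEXT, Number INTEGER);
    INSERT INTO Readings VALUES
        ('A', '1-2-11', 15),
        ('B', '1-2-11', 8),
        ('A', '1-1-11', 5);
""")

def build_pivot_query(conn):
    """Generate a pivot query with one output column per distinct date."""
    dates = [d for (d,) in conn.execute(
        "SELECT DISTINCT Date FROM Readings ORDER BY Date DESC")]
    columns, params = [], []
    for d in dates:
        alias = d.replace('"', '""')  # escape quotes in the column alias
        # The date value itself is bound as a parameter, not concatenated.
        columns.append(f'MAX(CASE WHEN Date = ? THEN Number END) AS "{alias}"')
        params.append(d)
    sql = ("SELECT Name, " + ", ".join(columns) +
           " FROM Readings GROUP BY Name ORDER BY Name")
    return sql, params

sql, params = build_pivot_query(conn)
cursor = conn.execute(sql, params)
headers = [col[0] for col in cursor.description]
pivoted = cursor.fetchall()
print(headers)
print(pivoted)
```

Because the column list depends on the data, the query has to be regenerated when new dates appear. A fixed view cannot grow columns on its own, which is why this is usually done in a stored procedure or in application code.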

You want to look into "cross-tab" or "pivot" statements. In SQL Server 2005 and up, it's PIVOT, but the syntax varies between platforms.
This is a very complex subject, particularly since you want to add columns to a view as your data grows over time. Besides your platform's documentation, check out the myriad other SO posts on the subject.

If the date column is a known set then you can use pivot in some cases.
It is often faster to use dynamic sql BUT this can be very dangerous so be wary.
To really know what the best solution is for your problem we would need some more information -- how much data -- how much variation is expected in the different columns, etc.
However, it is true, both PIVOT and Dynamic SQL will be faster than a temp table.

I would do it with Access or Excel instead of T-SQL.

Query sql on string

I have a database of users, each of which has a record like this.
I would like to run a query on data such as
CN=aaa, OU=Domain,OU=User, OU=bbbbbb,OU=Department, OU=cccc, OU=AUTO, DC=dddddd, DC=com
and I need to group all users that share the same OU=Department.
How can I write a SELECT that uses a substring to search for a department?
My idea for the solution is to create another table that is like this:
---------------------------------------------------
ldapstring | society | site
---------------------------------------------------
"CN=aaa, OU=Domain,OU=User, OU=bbbbbb,OU=Department, OU=cccc, OU=AUTO, DC=dddddd, DC=com" | societyName1 | societySite1
My idea is to compare each user's string against the strings in the new table using LIKE, but how can I retrieve the society and site when the LIKE match occurs?
Please help me

You could always do ColumnName LIKE '%OU=Department%'.
Regardless, I think this needs to be normalized into a better table, if possible. Multivalue columns should be avoided as much as possible.
If you aren't dealing with a database, the next best thing would be a regular expression.
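To get the society and site back when a LIKE match occurs, the LIKE condition can be used directly as the join condition. This is a minimal sketch using Python's sqlite3 module. The ou_lookup table with a short ou_fragment column is an assumption: it replaces the full LDAP string the question proposed storing, and the users table and second user are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (username TEXT, ldapstring TEXT);
    -- Hypothetical lookup table keyed by an OU fragment, not the full string.
    CREATE TABLE ou_lookup (ou_fragment TEXT, society TEXT, site TEXT);

    INSERT INTO users VALUES
        ('aaa', 'CN=aaa, OU=Domain,OU=User, OU=bbbbbb,OU=Department, OU=cccc, OU=AUTO, DC=dddddd, DC=com'),
        ('zzz', 'CN=zzz, OU=Domain,OU=User, OU=eeeeee,OU=Other, DC=dddddd, DC=com');
    INSERT INTO ou_lookup VALUES ('OU=bbbbbb', 'societyName1', 'societySite1');
""")

# The LIKE pattern is built from the lookup row, so each matching user row
# comes back together with that row's society and site.
MATCH_SQL = """
    SELECT u.username, l.society, l.site
    FROM users AS u
    JOIN ou_lookup AS l
      ON u.ldapstring LIKE '%' || l.ou_fragment || '%'
    ORDER BY u.username
"""

matches = conn.execute(MATCH_SQL).fetchall()
print(matches)
```

Note that a pattern like '%OU=bbbbbb%' also matches longer values such as OU=bbbbbbX, and a leading % prevents index use, which is another argument for normalizing the OUs into their own table.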

Maybe you should look into MySQL regular expressions. I, myself, have never used it, but just wanted to suggest it :-)
http://dev.mysql.com/doc/refman/5.1/en/regexp.html
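If the strings are processed outside the database, a regular expression can pull the OU values out so they can be stored in a normalized table. This is a rough sketch with Python's re module, not MySQL's REGEXP; the function name organizational_units is made up, and it assumes OU values never contain commas.

```python
import re

# Match "OU=<value>" at the start of the string or right after a comma,
# allowing optional whitespace after the comma.
OU_PATTERN = re.compile(r"(?:^|,)\s*OU=([^,]+)", re.IGNORECASE)

def organizational_units(dn):
    """Return the OU values of an LDAP-style string, in order of appearance."""
    return [value.strip() for value in OU_PATTERN.findall(dn)]

dn = ("CN=aaa, OU=Domain,OU=User, OU=bbbbbb,OU=Department, "
      "OU=cccc, OU=AUTO, DC=dddddd, DC=com")
units = organizational_units(dn)
print(units)
```

Each extracted value could then become one row in a (user, ou) table, which makes grouping by department an ordinary GROUP BY.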
