What is the most performant way to build and execute a multiple where clause in SQL from a single table of identifiers?

Here's the challenge. Users want to be able to create filters based on N criteria, and the criteria used for the filter form a fluid hierarchy. To simplify it, let's use two hierarchies that the user could select from:
All Territories
    Europe
        UK
        France
    Americas
        US
        Canada
        Mexico
Media
    Music
        Downloads
        CDs
    Movies
        Streaming
        DVD
Objects would have a table of tags associated with them. The ObjectsTags table would contain an indicator as to which type of data the tag is linked to.
The issue is that users would want to select and group the tags they want to filter by. So they might want Movies in Europe, so they would select those three tags as a single grouped filter. It's easy enough to get a filter based on those three tags that says:
Any object that has a tag of: (All Territories OR Europe OR UK OR France) AND (All Media OR Movies OR DVD OR Streaming). The challenge is that I need to support any number of new hierarchies that might be needed and any level of filters, since a user could also want a filter that returns everything from that filter as well as all of the CDs in the US.
Is there any new feature in SQL Server that would be better suited for handling this type of a where clause in a performant way?

You are either going to have to create your where clause dynamically, or you will pre-create the SQL using a where clause similar to the following:
where country = coalesce(p_country, country)
and media = coalesce(p_media, media)
and music = coalesce(p_music, music)
The really cool part of this statement? Your performance will be the worst that it can possibly be.
I recommend creating a dynamic statement with the specific conditions you need.
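For illustration, one set-based way to express that kind of grouped filter (not dynamic SQL, but it keeps the shape of the problem visible) is to load the user's selections into a filter table and require a match in every group. The names below (ObjectTags, FilterTags, GroupId, TagId) are hypothetical; FilterTags is assumed to hold the tag ids of each grouped filter, with the hierarchy already expanded into individual tags:
-- FilterTags(GroupId, TagId): e.g. group 1 = All Territories/Europe/UK/France,
--                             group 2 = All Media/Movies/DVD/Streaming
select ot.ObjectId
from ObjectTags as ot
join FilterTags as f
  on f.TagId = ot.TagId
group by ot.ObjectId
having count(distinct f.GroupId) = (select count(distinct GroupId) from FilterTags);
-- an object qualifies only if it has at least one tag from every selected group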

Related

How to fix spelling mistakes in a database that has multiple records with differently spelled values

I have a database with country, city, state and hotel data. In these tables the country name has multiple inconsistent records; for example, Mexico is wrongly spelled as 'maxico' and 'mxico' as well as 'mexico', and other records like 'usa', 'united states of america' and 'america' refer to the same country. These countries in turn have multiple wrongly spelled states, and the states have multiple wrongly spelled cities, but the hotels are unique, and I want to set each one to its right city, state and country. For example, some hotel is in the city of Chicago, the state of Illinois, and the country is the USA. Please help me: how can I fix this?
You could do an update if you know all the different incorrect variants:
update tbl
set country = 'Mexico'
where country in ('maxico', 'mxico')
Well, you can list all the values the country column has, and then check whether each value is right; if it is wrong, just use an UPDATE statement to fix the wrong value, like below:
update my_table set country = 'Mexico' where country in ('maco', 'xico');
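The listing step mentioned above could be as simple as a DISTINCT scan of the column (using the same my_table as in the example):
-- list every spelling currently in use so the wrong ones can be spotted
select distinct country
from my_table
order by country;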
It depends on the infrastructure you're running.
If you have access to some ETL tools, they often have data-quality capabilities, often with databases used for correcting addresses. Those are often paid.
If you are a "private" developer, then you might not want to use paid data, so you can look for open data sources, like the Allegheny County addresses at https://catalog.data.gov.
You can use a multitude of algorithms and solutions, ranging from simple distances in word space to neural networks pre-trained to do just that.
This type of data problem is hard. There is no built-in simple way to determine the "right spelling". Many databases have one of two capabilities built in that can help -- either "soundex" algorithms or Levenshtein distance.
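As a quick illustration of the soundex idea (the soundex() function is built into SQL Server and MySQL, for example, while Levenshtein distance usually needs an extension or a user-defined function; hotels is a hypothetical table name):
-- misspellings often share a soundex code with the correct name,
-- which helps surface candidate rows for manual review
select distinct country
from hotels
where soundex(country) = soundex('Mexico');   -- matches 'maxico' and 'mxico' as well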
What should you do? If you really want to fix this problem, create a table with the misspelled name and the correct value that you want. This table will need to be maintained manually, such as in a spreadsheet. Then use this table when importing data and use only the rectified value.
Better yet, set up a reference table with only the correct names. Create a second table with alternative names, which is maintained as above.
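A minimal sketch of that mapping-table approach, with hypothetical names (country_fixes as the hand-maintained table, hotels as the table being cleaned); the update-from-join syntax shown is SQL Server style and differs slightly between databases:
-- hand-maintained mapping of misspellings to the correct value
create table country_fixes (
    bad_name  varchar(100) primary key,
    good_name varchar(100) not null
);

insert into country_fixes (bad_name, good_name) values
    ('maxico', 'Mexico'),
    ('mxico',  'Mexico'),
    ('united states of america', 'USA'),
    ('america', 'USA');

-- rectify the data against the mapping
update h
set country = f.good_name
from hotels as h
join country_fixes as f
  on f.bad_name = h.country;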

Tableau count values after a GROUP BY in SQL

I'm using Tableau to show some schools data.
My data structure gives a table that has all the school classes in the country. The thing is, I need to count, for example, how many schools have both Primary and Preschool.
A simplified version of my table should look like this:
In that table, if I want to know the number needed in the example, the result should be 1, because only one school has both Primary and Preschool.
I want to have a multiple filter in Tableau that gives me that information.
I was thinking about the SQL query that would be needed, and it requires a GROUP BY statement. An example of the query is here in a fiddle: Database example query
In the SQL query I group by id all the schools that meet either one of the conditions inside the IN(...) and then count how many of them meet both (c = 2).
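For reference, the query described there might look roughly like this, assuming a hypothetical table school_classes(school_id, school_type) with one row per class:
-- count the schools that have at least one Primary and at least one Preschool row
select count(*) as schools_with_both
from (
    select school_id
    from school_classes
    where school_type in ('Primary', 'Preschool')
    group by school_id
    having count(distinct school_type) = 2   -- c = 2: both types are present
) as both_types;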
Is there a way to do something like this in Tableau, either using groups or sets, using advanced filters, or programming a RAW SQL calculated field?
Thanks!
Dubafek
PS: I add a link to my question in Tableu's forum because you can download my testing workbook there: Tableu's forum question
I've solved the issue using LODs (specifically INCLUDE and EXCLUDE statements).
I created two calculated fields having the aggregation I needed:
Then I made a calculated field that keeps only the School IDs whose number of types (according to the filtering) matches the number of types selected in the multiple filter (both of the fields shown above):
Finally, I used COUNTD([Condition]) to display the number of schools matching at least the School types selected.
Hope this helps someone with a similar issue.
PS: If someone wants the Workbook with the solution I've uploaded it in an answer in the Tableau Forum

SQL complicated query with joins

I have a problem with one query.
Let me explain what I want:
For the sake of brevity let's say that I have three tables:
-Offers
-Ratings
-Users
Now what I want to do is to create SQL query:
I want Offers to be listed with all their fields plus an additional temporary column, which IS NOT stored anywhere, called AverageUserScore.
This AverageUserScore is the product of grabbing all offers belonging to a particular user, then grabbing all ratings belonging to these offers, and then evaluating the average of those ratings - this average score is the AverageUserScore.
To explain it even further: I need this query for a Ruby on Rails application. In the browser, inside the application, you can see all offers of other users, with AverageUserScore at the very end, as the last column.
Associations:
Offer has many ratings
Offer belongs to user
Rating belongs to offer
User has many offers
Assumptions made:
You actually have a numeric column (of any type that SQL's AVG is fine with) in your Rating model. I'm using a column ratings.rating in my examples.
AverageUserScore is unconventional, so average_user_score is better.
You don't mind not getting users that have no offers: average rating is not clearly defined for them anyway.
You don't deviate from Rails' conventions far enough to have a primary key other than id.
Displaying offers for each user is a straightforward task: in a loop of @users.each do |user|, you can do user.offers.each do |offer| and be set. The only problem here is that it will execute a separate query for every user. Not good.
The "fetching offers" part is a standard N+1 counter seen even in the guides.
@users = User.includes(:offers).all
The interesting part here is only getting the averages.
For that I'm going to use Arel. It's already part of Rails, ActiveRecord is built on top of it, so you don't need to install anything extra.
You should be able to do a join like this:
User.joins(offers: :ratings)
And this won't get you anything interesting on its own (apart from filtering out users that have no offers). Inside though, you'll get a huge set of every rating joined with its corresponding offer and that offer's user. Since we're taking averages per user we need to group by users.id, effectively making one entry per users.id value. That is, one per user. A list of users, yes!
Let's stop for a second and make some assignments to make Arel-related code prettier. In fact, we only need two:
users = User.arel_table
ratings = Rating.arel_table
Okay. So. We need to get a list of users (all fields), and for each user fetch an average value seen on his offers' ratings' rating field. So let's compose these SQL expressions:
# users.*
user_fields = users[Arel.star] # Arel.star is a portable SQL "wildcard"
# AVG(ratings.rating) AS average_user_score
average_user_score = ratings[:rating].average.as('average_user_score')
All set. Ready for the final query:
User.includes(:offers) # N+1 counteraction
.joins(offers: :ratings) # dat join
.select(user_fields, average_user_score) # fields we need
.group(users[:id]) # grouping to only get one row per user
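For readers who think in SQL, the join-and-group part of that chain produces something roughly like the following (a PostgreSQL-flavoured sketch; the exact statement depends on the Rails version and on how the includes is resolved):
select users.*,
       avg(ratings.rating) as average_user_score
from users
inner join offers  on offers.user_id   = users.id
inner join ratings on ratings.offer_id = offers.id
group by users.id;
-- selecting users.* while grouping only by users.id relies on the primary key
-- functionally determining the other columns (PostgreSQL 9.1+); on other databases
-- you may have to list every selected column in the GROUP BY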

How to cache SQL queries, the results of which never change?

I have three tables - Countries, USRegions and Regions. For the different countries, Regions holds a particular administrative division of the territory of that country. For the US these are the 50 states and Washington, D.C. So again:
Countries - USA, Germany, Japan etc.
USRegions - Pacific, Middle Atlantic, South Atlantic (a total of 9 rows)
Regions - Virginia, Texas etc., but also similar divisions for other countries, like regional cities for Poland etc.
In one of the views I have a form where the user can select certain US Regions and certain States. In order to display the names of these, I make two calls to the database: one for all rows of the USRegions table, and one for all rows of the Regions table with country_id = USA.id. The thing is, the result of these calls will always be the same (unless there are new states, which is very unlikely, although possible). So is there a way to cache these calls, and if yes, how and where?
Rails automatically caches SQL queries for the duration of a request if it encounters the same query again. Here is the reference link for this: SQL Caching. If you still want to cache beyond that, you can use Low Level Caching; key-based caching would do the work.

Localisation of country names

As part of addresses I am storing in my SQL database country codes (e.g. US, DE,...). I then have another table (with two columns) in my database which translates the country codes to the English language names of the respective countries.
If I want to make the site multi-language, I could expand this translation table adding country names in other languages than English.
I was wondering if there is another method which does not involve modification of the database, e.g. using gettext to translate the English country names?
The typical way to handle this is to change the table structure to have three columns, instead of two:
Language
CountryCode
FullName
Whenever you query the database, you would provide the current language.
You then have to change your code to include the additional language key in any queries.
Depending on how you are going to keep track of the current language, you would also use a view or user defined function.
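A minimal sketch of that three-column table and the lookup it enables (table and column names are illustrative, not from the original schema):
create table CountryNames (
    Language    char(2)      not null,   -- e.g. 'en', 'de'
    CountryCode char(2)      not null,   -- e.g. 'US', 'DE'
    FullName    varchar(100) not null,
    primary key (Language, CountryCode)
);

-- look up a country's display name in the current language
select FullName
from CountryNames
where CountryCode = 'DE'
  and Language = 'en';   -- the current language, supplied by the application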
You don't want to use automated translation, since the name of a country like "China" could turn into the equivalent of "porcelain".