How to build an efficient query string in ASP.NET Core? - asp.net-core

Hello World,
I'm in research mode for one of feature to be built in our software and there one new thing that we have never faced.
The thing is, on one form we have a drop down with list of items. User can select default which means all items needs to be considered or else he can selectively opt for certain list items.
Actually the form is related to filter functionality depending upon users input the data is going to get filtered and displayed on UI.
The main problem we are trying to solve is suppose user selects default, which means all list items ID's are gonna be considered in POST call of API. The list can be huge, say 1 to 1K and above too.
So under such circumstances we can build the query string but, it seems its gonna be so huge. I have also studied that certain browsers support limited query string as per their standard limits.
So currently I have following doubts in mind.
Will shortening of query string work here ?
By which technique it can be handled efficiently ?
What performance considerations I need to take care during during so ?
Any suggestions or thoughts are welcome. That would boast my software design thinking.

based on what I understand from your question, here is my opinion:[if I understood wrong, please correct me, so I can help you]
You need to send query in URL and not in body or using JSON!is that correct?
I think you don't need to send every one of the selected items one by one!
If there are selected in serial, you can perform a range in your query!
Like http://abcd/test?id=1-43,6-765(take ID as string and then export the useful data in back-end) with this approach, you can shorten your query!
And also think about the database too (if there is any).querying this much data is use a lot of IO and make query low performance.

Related

Accessing the paging data from a native script in ElasticSearch

I am currently using a native script written in Java to filter the search results based on various forms of access control. The problem is that the access control verification takes a ridiculous amount of time per-record. There are some ways we can improve somewhat, but we came up with a work-around that will improve it drastically. The only problem is that I'm not sure if I can do it the way I want.
The solution: I need to stop assessing the access controls after the relevant number of results have been found.
The problem: I can't figure out how to access the offset and page size from within the script (implementing AbstractSearchScript at the moment) in order to decide when I've reached my minimum results. Does anyone have any idea how to get this data "properly" without making it a separate parameter of the script?
The bonus: I need to return a number of hits that is close to or larger than the actual number of hits. Since elasticsearch doesn't cache the results of the query, I can work around the problem by simply returning true for every result past the relevant ones. But I'd like to work a solution closer to Google's where I return a number of remaining results based on what percentage of the data was a hit so far. However, to do this (and to avoid potential complications) I'd like to just modifying the hits data directly. Is there any way to do this from a script?
You should probably use the "terminate_after" parameter introduced in ES 1.4, rather than trying to implement this yourself.
For your "bonus," I would just run the query a second time without the ACLs.

Is it bad to not use normalised tables in this database?

I recently learned about normalisation in my informatics class and I'm developing a multiplayer game using SQLite as backend database at the moment.
Some information on it:
The simplified structure looks a bit like the following:
player_id | level | exp | money | inventory
---------------------------------------------------------
1 | 3 | 120 | 400 | {item a; item b; item c}
Okay. As you can see, I'm storing a table/array in string form in the column "inventory". This is against normalization.
But the thing is: Making an extra table for the inventory of players brings only disadvantages for me!
The only points where I access the database is:
When a player joins the game and his profile is loaded
When a player's profile is saved
When a player joins, I load his data from the DB and store it in memory. I only write to the DB like every five minutes when the player is saved. So there are actually very few SQL queries in my script.
If I used an extra table for the inventory I would have to, upon loading:
Perform an performance and probably more data-intensive query to fetch all items from the inventory table which belong to player X
Walk through the results and convert them into a table for storage in memory
And upon saving:
Delete all items from the inventory table which belong to player X (player might have dropped/sold some items?)
Walk through the table and perform a query for each item the player owns
If I kept all the player data in one table:
I'd only have one query for saving and loading
Everything would be in one place
I would only have to (de)serialize the tables upon loading and saving, in my script
What should I do now?
Do my arguments and situation justify working against normalisation?
Are you saying that you think parsing a string out of "inventory" doesn't take any time or effort? Because everything you need to do to store/retrieve inventory items from a sub table is something you'd need to do with this string, and with the string you don't have any database tools to help you do it.
Also, if you had a separate subtable for inventory items, you could add and remove items in real time, meaning that if the app crashes or the user disconnects, they don't lose anything.
There are a lot of possible answers, but the one that works for you is the one to choose. Keep in mind, your choice may need to change over time.
If the amount of data you need to persist is small (ie: fits into a single table row) and you only need to update that data infrequently, and you don't have any reason to care about subsets of that data, then your approach makes sense. As time goes on and your players gain more items and you add more personalization to the game, you may begin to push up against the limits of SQLite, and you'll need to evolve your design. If you discover that you need to be able to query the item list to determine which players have what items, you'll need to evolve your design.
It's generally considered a good idea to get your data architecture right early, but there's no point in sitting in meetings today trying to guess how you'll use your software in 5-10 years. Better to get a design that meets this year's needs, and then plan to re-evaluate the design again after a year.
What's going to happen when you have one hundred thousand items in your inventory and you only want to bring back two?
If this is something that you're throwing together for a one off class and that you won't ever use again, then yes, the quick and dirty route might be a quicker option for you.
However if this is something you're going to be working on for a few months, then you're going to run into long-term issues with that design decision.
No, your arguments aren't valid. They basically boil down to "I want to do all of this processing in my client code instead of in SQL and then just write it all to a single field" because you are still doing all of the exact same processing to generate the string. By doing this you are removing the ability to easily load a small portion of the list and losing relationships to the actual item table which could contain more information about the items (I assume you're hard coding it all based on names instead of using internal item IDs which is a really bad idea, imo).
Don't do it. Long term the approach you are wanting to take will generate a lot more work for you as your needs evolve.
Another case of premature optimization.
You are trying to optimize something that you don't have any performance metrics. What is the target platform? Even crappiest computers nowadays could run at least hundreds of your reading operation per second. Then you add better hardware for more users, then you can go to cloud and when you come into problem space that Google, Twitter and Facebook are dealing with, you can consider denormalizing. Even then, best solution is some sort of key-value database.
Maybe you should check Wikipedia article on Database Normalization to remind you why normalized database is a good thing.
You should also think about the items. Are the items unique for every user or does user1 could have item1 and user2 have item1 to. If you now want to change item1 you have to go through your whole table and check which user have this item. If you would normalize your table, this would be much more easy.
But it the end, I think the answer is: It depends
Do my arguments and situation justify
working against normalisation?
Not based on what I've seen so far.
Normalized database designs (appropriately indexed and with efficient usage of the database with UPSERTS, transactions, etc) in general-purpose engines will generally outperform code except where code is very tightly optimized. Typically in such code, some feature of the general purpose RDBMS engine is abandoned, such as one of the ACID properties or referntial integrity.
If you want to have very simple data access (you tout one table, one query as a benefit), perhaps you should look at a document centric database like mongodb or couchdb.
The reason that you use any technology is to leverage the technology's advantages. SQL has many advantages that you seem to not want to use, and that's fine, if you don't need them. In Neal Stephenson's Zodiac, the main character mentions that few things bought from a hardware store are used for their intended purpose. Software's like that, too. What counts is that it works, and it works nearly 100% of the time, and it works fast enough.
And yet, I can't help but think that someday you're going to have some overpowered item released into the wild, and you're going to want to deal with this problem at the database layer. Say you accidently gave out some superinstakillmegadeathsword inventory items that kill everything within 50 meters on use (wielder included), and you want to remove those things from play. As an apology to the people who lose their superinstakillmegadeathsword items, you want to give them 100 money for each superinstakillmegadeathsword you take away.
With a properly normalized database structure, that's a trivial task. With a denormalized structure, it's quite a bit harder and slower. A normalized database is also going to be easier to expand on the design in the future.
So are you sure you don't want to normalize your database?

How do you think while formulating Sql Queries. Is it an experience or a concept?

I have been working on sql server and front end coding and have usually faced problem formulating queries.
I do understand most of the concepts of sql that are needed in formulating queries but whenever some new functionality comes into the picture that can be dont using sql query, i do usually fails resolving them.
I am very comfortable with select queries using joins and all such things but when it comes to DML operation i usually fails
For every query that i never done before I usually finds uncomfortable with that while creating them. Whenever I goes for an interview I usually faces this problem.
Is it their some concept behind approaching on formulating sql queries.
Eg.
I need to create an sql query such that
A table contain single column having duplicate record. I need to remove duplicate records.
I know i can find the solution to this query very easily on Googling, but I want to know how everyone comes to the desired result.
Is it something like Practice Makes Man Perfect i.e. once you did it, next time you will be able to formulate or their is some logic or concept behind.
I could have get my answer of solving above problem simply by posting it on stackoverflow and i would have been with an answer within 5 to 10 minutes but I want to know the reason. How do you work on any new kind of query. Is it a major contribution of experience or some an implementation of concepts.
Whenever I learns some new thing in coding section I tries to utilize it wherever I can use it. But here scenario seems to be changed because might be i am lagging in some concepts.
EDIT
How could I test my knowledge and
concepts in Sql and related sql
queries ?
Typically, the first time you need to open a child proof bottle of pills, you have a hard time, but after that you are prepared for what it might/will entail.
So it is with programming (me thinks).
You find problems, research best practices, and beat your head against a couple of rocks, but in the process you will come to have a handy set of tools.
Also, reading what others tried/did, is a good way to avoid major obsticles.
All in all, with a lot of practice/coding, you will see patterns quicker, and learn to notice where to make use of what tool.
I have a somewhat methodical method of constructing queries in general, and it is something I use elsewhere with any problem solving I need to do.
The first step is ALWAYS listing out any bits of information I have in a request. Information is essentially anything that tells me something about something.
A table contain single column having
duplicate record. I need to remove
duplicate
I have a table (I'll call it table1)
I have a
column on table table1 (I'll call it col1)
I have
duplicates in col1 on table table1
I need to remove
duplicates.
The next step of my query construction is identifying the action I'll take from the information I have.
I'll look for certain keywords (e.g. remove, create, edit, show, etc...) along with the standard insert, update, delete to determine the action.
In the example this would be DELETE because of remove.
The next step is isolation.
Asnwer the question "the action determined above should only be valid for ______..?" This part is almost always the most difficult part of constructing any query because it's usually abstract.
In the above example you're listing "duplicate records" as a piece of information, but that's really an abstract concept of something (anything where a specific value is not unique in usage).
Isolation is also where I test my action using a SELECT statement.
Every new query I run gets thrown through a select first!
The next step is execution, or essentially the "how do I get this done" part of a request.
A lot of times you'll figure the how out during the isolation step, but in some instances (yours included) how you isolate something, and how you fix it is not the same thing.
Showing duplicated values is different than removing a specific duplicate.
The last step is implementation. This is just where I take everything and make the query...
Summing it all up... for me to construct a query I'll pick out all information that I have in the request. Using the information I'll figure out what I need to do (the action), and what I need to do it on (isolation). Once I know what I need to do with what I figure out the execution.
Every single time I'm starting a new "query" I'll run it through these general steps to get an idea for what I'm going to do at an abstract level.
For specific implementations of an actual request you'll have to have some knowledge (or access to google) to go further than this.
Kris
I think in the same way I cook dinner. I have some ingredients (tables, columns etc.), some cooking methods (SELECT, UPDATE, INSERT, GROUP BY etc.) then I put them together in the way I know how.
Sometimes I will do something weird and find it tastes horrible, or that it is amazing.
Occasionally I will pick up new recipes from the internet or friends, then use parts of these in my own.
I also save my recipes in handy repositories, broken down into reusable chunks.
On the "Delete a duplicate" example, I'd come to the result by googling it. This scenario is so rare if the DB is designed properly that I wouldn't bother keeping this information in my head. Why bother, when there is a good resource is available for me to look it up when I need it?
For other queries, it really is practice makes perfect.
Over time, you get to remember frequently used patterns just because they ARE frequently used. Rare cases should be kept in a reference material. I've simply got too much other stuff to remember.
Find a good documentation to your software. I am using Mysql a lot and Mysql has excellent documentation site with decent search function so you get many answers just by reading docs. If you do NOT get your answer at least you are learning something.
Than I set up an example database (or use the one I am working on) and gradually build my SQL. I tend to separate the problem into small pieces and solve it step by step - this is very successful if you are building queries including many JOINS - it is best to start with some particular case and "polute" your SQL with many conditions like WHEN id = "123" which you are taking out as you are working towards your solution.
The best and fastest way to learn good SQL is to work with someone else, preferably someone who knows more than you, but it is not necessarry condition. It can be replaced by studying mature code written by others.
Your example is a test of how well you understand the DISTINCT keyword and the GROUP BY clause, which are SQL's ways of dealing with duplicate data.
Examples and experience. You look at other peoples examples and you create your own code and once it groks, you don't need to think about it again.
I would have a look at the Mere Mortals book - I think it's the one by Hernandez. I remember that when I first started seriously with SQL Server 6.5, moving from manual ISAM databases and Access database systems using VB4, that it was difficult to understand the syntax, the joins and the declarative style. And the SQL queries, while powerful, were very intimidating to understand - because typically, I was looking at generated code in Microsoft Access.
However, once I had developed a relatively systematic approach to building queries in a consistent and straightforward fashion, my skills and confidence quickly moved forward.
From seeing your responses you have two options.
Have a copy of the specification for whatever your working on (SQL spec and the documentation for the SQL implementation (SQLite, SQL Server etc..)
Use Google, SO, Books, etc.. as a resource to find answers.
You can't formulate an answer to a problem without doing one of the above. The first option is to become well versed into the capabilities of whatever you are working on.
The second option allows you to find answers that you may not even fully know how to ask. You example is fairly simplistic, so if you read the spec/implementation documentaion you would know the answer right away. But there are times, where even if you read the spec/documentation you don't know the answer. You only know that it IS possible, just not how to do it.
Remember that as far as jobs and supervisors go, being able to resolve a problem is important, but the faster you can do it the better which can often be done with option 2.

Business Applications: What are the fundamental features of a search form?

In a typical business application it is quite common to have forms that are used for searching.
Some basic features are:
A pane that contains the search criteria
A grid to display the results
Sorting on the grid
A detail page that opens when an item is selected in the results grid
What other features would you expect in a business application's search functionality?
Maybe it's a bit trite but there is some sense in this picture:
removed dead ImageShack link
Do it as it shown at the second example, not as at the 3rd one.
There is a well known extreme programming principle - YAGNI. I think it's absolutely appliabe to almost any problem. You always can add something new if it's necessary, but it's much more difficult to remove something what is already exist because someone already uses it even if it's wrong.
How about the ability to save search criteria, in order to easily re-run a search later. Or, the ability to easily, cleanly, print the list of results.
If search refining is allowed (given a search result, limited future searches to the current results), you may also want to add a breadcrumb system, so that the user can see the sequence of refinements that lead you to the current result-set -- and by clicking on a breadcrumb, return to a previous refinement stage.
Faceted search:
(source: msdn.com)
This is displayed in the area in the right ellipse. There are filters and the engine shows the number of results that will remain after aplying the filter. This is very useful and can be done without pain in some search engines, such as Apache Solr. Of course, implement this only if filters make sense in your task.
Aggregate summary info, like total(s), count(s) or percentages.
One or more menus, like right click context for the grid, a ribbon or menu on top.
Your list for the UI elements is kinda good. Export, print (asking them whether it is really necessary to print this?), category/tag and language selection is worth to consider. Smart and working pagination (don't forget ordering).
Please do not force a search to open in a new (or even worse, always in the same window). Links of search results should be copy-pastable (always use GET),
But it really matters to have a functional (i.e. a really good) algorithm. Mostly I google company websites, because their search engine is, cough, awwwwkward. Looking for a feature chart, technical spec, pricing etc. one is not interested in press releases and vica-versa.
Search engine providers offer integration into company websites.
Use Auto-complete wherever possible on your text input fields.
If using selects or combo boxes with related information try and use chain selects to organise the information.
Where results depend on location try and serve relevant results.
Also remember to keep the search form as simple as possible even down to one text field. To refine the search you can have an alternate form as an "Advanced Search interface".
Printing, export.
A grid to display the results
Watch out not to display results a user is not authorized to see (roles / permissions / access rights).
A detail page that opens when an item is selected in the results grid
In case a user attempts to circumvent the search page links and enter some document directly, again, check out for permissions.
Validation, validation, validation.
It should be very hard, near impossible, for me to run a query that makes no sense. ie, start date occurring after an end date.
Export a numerical dataset (even if it only has one numeric column - so just make it so by default) to CSV for import into Excel (people love this function, even if only 1% of users seem to use it with any regularity. Just ask yourself when's the last time you highlighted something for copy-n-paste. Would it have been easier to open a CSV?
Refinable searches (think Google's use of site: -). People who use the search utility a lot will appreciate this. People who don't won't know it's not there.
The ability to choose to display 1 records, 5 records, 100 records, 1000 records, etc. "Paging" I believe is what we most commonly call it ;).
You mentioned sortable grids. Somebody else mentioned auto-sum or auto-count. Those are good if (once again) you have largely numeric data. But those are almost report-oriented functions.
Hope this helps.
One thing you can do is have a drop down of most common searches in plain english. e.g. "High value sales in New York in last 5 days". This is the equivalent of user selecting an amount, the city, date ranges etc. done conveniently for them.
Another thing is to have multiple search criteria tabs based on perspective of the user. Like "sales search", "reporting search", "admin search" etc.
ALso consider limiting the number of entries retrieved in the search and allow users to do more narrow searches. This depends on the business needs however.
The most commonly used search option listed first and in a prominent location.
I think your requirements are good. Take a cue from Google. Google got it right. One text box where you type whatever you want, and your engine spits out the answers. Most folks will try this, and if the answers are good enough, then that is what they will use. In the back-end, you'll probably want to flatten all of the data into a big honkin' table and then index it or use a SQL query with "LIKE" in it.
However, you will probably want to allow the user to refine the search. For this, have a link to "Advanced Search" and use a form there to specify filter criteria. This lets the user zero in on the results if basic search is not good enough. For the results on th is page, you will certainly want to have sorting on key fields, but do it after you have produced the initial result set.
It depends on the content that you are searching for.. make it relevant :) Search always look easy but can be incredibly difficult to get right.
Not mentioned yet, but very important I think - a search that actually works. This item is often neglected and makes the rest a bit moot.

Any SQL database: When is it better to fetch a whole table instead of querying for particular rows?

I have a table that contains maybe 10k to 100k rows and I need varying sets of up to 1 or 2 thousand rows, but often enough a lot less. I want these queries to be as fast as possible and I would like to know which approach is generally smarter:
Always query for exactly the rows I need with a WHERE clause that's different all the time.
Load the whole table into a cache in memory inside my app and search there, syncing the cache regularly
Always query the whole table (without WHERE clause), let the SQL server handle the cache (it's always the same query so it can cache the result) and filter the output as needed
I'd like to be agnostic of a specific DB engine for now.
with 10K to 100K rows, number 1 is the clear winner to me. If it was <1K I might say keep it cached in the application, but with this many rows, let the DB do what it was designed to do. With the proper indexes, number 1 would be the best bet.
If you were pulling the same set of data over and over each time then caching the results might be a better bet too, but when you are going to have a different where all the time, it would be best to let the DB take care of it.
Like I said though, just make sure you index well on all the appropriate fields.
Seems to me that a system that was designed for rapid searching, slicing, and dicing of information is going to be a lot faster at it than the average developers' code. On the other hand, some factors that you don't mention include the location or potential location of the database server in relation to the application - returning large data sets over slower networks would certainly tip the scales in favor of the "grab it all and search locally" option. I think that, in the 'general' case, I'd recommend querying for just what you want, but that in special circumstances, other options may be better.
I firmly believe option 1 should be preferred in an initial situation.
When you encounter performance problems, you can look on how you could optimize it using caching. (Pre optimization is the root of all evil, Dijkstra once said).
Also, remember that if you would choose option 3, you'll be sending the complete table-contents over the network as well. This also has an impact on performance .
In my experience it is best to query for what you want and let the database figure out the best way to do it. You can examine the query plan to see if you have any bottlenecks that could be helped by indexes as well.
First of all, let us dismiss #2. Searching tables is data servers reason for existence, and they will almost certainly do a better job of it than any ad hoc search you cook up.
For #3, you just say 'filter the output as needed" without saying where that filter is been done. If it's in the application code as in #2, than, as with #2, than you have the same problem as #2.
Databases were created specifically to handle this exact problem. They are very good at it. Let them do it.
The only reason to use anything other than option 1 is if the WHERE clause itself is huge (i.e. if your WHERE clause identifies each row individually, e.g. WHERE id = 3 or id = 4 or id = 32 or ...).
Is anything else changing your data? The point about letting the SQL engine optimally slice and dice is a good one. But it would be surprising if you were working with a database and do not have the possibility of "someone else" changing the data. If changes can be made elsewhere, you certainly want to re-query frequently.
Trust that the SQL server will do a better job of both caching and filtering than you can afford to do yourself (unless performance testing shows otherwise.)
Note that I said "afford to do" not just "do". You may very well be able to do it better but you are being paid (presumably) to provide functionality not caching.
Ask yourself this... Is spending time writing cache management code helping you fulfil your requirements document?
if you do this:
SELECT * FROM users;
mysql should perform two queries: one to know fields in the table and another to bring back the data you asked for.
doing
SELECT id, email, password FROM users;
mysql only reach the data since fields are explicit.
about limits: always ss best query the quantity of rows you will need, no more no less. more data means more time to drive it