Is there any time difference between responses to successful and failed API requests?

I am curious about one thing.
For example, let's say I am going to update my email on a website. My userId is 1.
My request body for a successful request is:
currentEmail: example1@gmail.com,
newEmail: example2@gmail.com,
userId: 1
I will get a successful response for the above. But the request below will fail, because there isn't any user with a userId of 2:
currentEmail: example1@gmail.com,
newEmail: example2@gmail.com,
userId: 2
Will there be any measurable time difference between these two? After all, the second request won't trigger any database write.
Also, let's say I try to find a user by userId:
GET api/findUser/{userId}
If there isn't any user with that userId, will there be any time difference between the successful and failed requests?

If you're "curious" - go ahead and measure it, the real results will depend on the API implementation, DB implementation, caching on DB and ORM levels, etc.
For example if in 1st case the API just calls SQL Update statement - the execution time should be similar, however if the API builds a "user" DTO first - the attempt to amend non-existing user will be faster.
In the latter case my expectation is that attempt to get info for the user which doesn't exist will be faster, however it also "depends".
So you need to inspect the associated code execution footprint, query plans and maybe even load test the database separate from the API.
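If you want a concrete starting point, here is a minimal sketch of such a measurement using Java 11's built-in HttpClient; the endpoint URL and JSON bodies are illustrative, not from your API:

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class TimingProbe {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();

        // Hypothetical bodies: userId 1 exists, userId 2 does not.
        String existingUser = "{\"currentEmail\":\"example1@gmail.com\",\"newEmail\":\"example2@gmail.com\",\"userId\":1}";
        String missingUser  = "{\"currentEmail\":\"example1@gmail.com\",\"newEmail\":\"example2@gmail.com\",\"userId\":2}";

        for (String body : new String[] { existingUser, missingUser }) {
            HttpRequest request = HttpRequest.newBuilder(URI.create("https://example.com/api/updateEmail"))
                    .header("Content-Type", "application/json")
                    .POST(HttpRequest.BodyPublishers.ofString(body))
                    .build();

            // Warm-up call so connection setup doesn't skew the sample.
            client.send(request, HttpResponse.BodyHandlers.ofString());

            long start = System.nanoTime();
            HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            System.out.println("status=" + response.statusCode() + ", took " + elapsedMs + " ms");
        }
    }
}

Take many samples and compare the distributions rather than single calls; network jitter will usually dwarf the difference made by one skipped database write.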

How to know the jobId of the first run of a cached query in BigQuery?

When we run a query in the BigQuery environment, the results are cached in a temporary table. From then on, when we run the same query multiple times, the subsequent runs fetch the results from that cache for the next 24 hours, with some exceptions. Now my use case is: on those subsequent runs, I want to know which jobId the cached results came from, i.e. the previous, first-time run of the query.
I have checked all the Java docs related to query and didn't find that info. We have the cacheHit variable, which tells you whether the query was fetched from the cache or not. Here I want to go one step further: from which jobId were the results fetched? I expected that maybe in this method I could find that info, but I always get a null value for it. I also want to know what parentJob means in a BigQuery context.
It's unclear why you'd even care about this other than as a technical exercise. If you want to build your own application caching layer, that's a different concern. More details about query caching can be found at https://cloud.google.com/bigquery/docs/cached-results.
The easiest way to do this would probably be to traverse jobs.list until you find a job that has the same destination table (it will have an anonymous-table prefix) and whose cacheHit stat is false or not present.
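A rough, untested sketch of that traversal with the google-cloud-bigquery Java client; treat the matching logic as an assumption about your jobs:

import com.google.cloud.bigquery.BigQuery;
import com.google.cloud.bigquery.BigQueryOptions;
import com.google.cloud.bigquery.Job;
import com.google.cloud.bigquery.JobConfiguration;
import com.google.cloud.bigquery.JobStatistics;
import com.google.cloud.bigquery.QueryJobConfiguration;
import com.google.cloud.bigquery.TableId;

public class CacheSourceJobFinder {
    // Finds the most recent non-cache-hit query job that wrote to the given
    // (anonymous) destination table, i.e. the job that populated the cache.
    static String findSourceJobId(BigQuery bigquery, TableId destination) {
        for (Job job : bigquery.listJobs().iterateAll()) {
            JobConfiguration config = job.getConfiguration();
            if (!(config instanceof QueryJobConfiguration)) {
                continue;
            }
            QueryJobConfiguration queryConfig = (QueryJobConfiguration) config;
            JobStatistics.QueryStatistics stats = job.getStatistics();
            boolean cacheHit = Boolean.TRUE.equals(stats.getCacheHit());
            if (!cacheHit && destination.equals(queryConfig.getDestinationTable())) {
                return job.getJobId().getJob();
            }
        }
        return null; // not found within the listed window
    }

    public static void main(String[] args) {
        BigQuery bigquery = BigQueryOptions.getDefaultInstance().getService();
        // The destination would come from your cached job's configuration;
        // these names are purely illustrative.
        TableId destination = TableId.of("_anon_dataset", "anon_table");
        System.out.println(findSourceJobId(bigquery, destination));
    }
}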
Your inquiry about parentJob is unrelated to this exercise. It's for finding all the child jobs created as part of a script or multi-statement execution. More information about this can be found at https://cloud.google.com/bigquery/docs/reference/standard-sql/scripting-concepts.

Create master table for status column

I have a table that represents a request sent through the frontend:
coupon_fetching_request
---------------------------------------------------------------
request_id | request_time | requested_by | request_status
Above, I have created a simplified table to illustrate the issue.
Here request_status is an integer. It can have the following values:
1 : request successful
2 : request failed due to incorrect input data
3 : request failed in otp verification
4 : request failed due to internal server error
That table is very simple, and the status is used to let the frontend know what happened to the request it sent. I had a discussion with my team, and the other developers proposed that we should have a status representation table. On the database side we are not going to need this status description. But the team argued that in the future we may need to produce simple output from the database showing the status of all requests. Per the YAGNI principle, I don't think it is a good idea.
Currently I have written frontend code to convert the returned request_status value into a descriptive string. I tried to convince the team that I could instead create an enumeration at the business layer to represent the meaning of the status (see the sketch below), or add documentation to the frontend and the Java code, but I failed to convince them.
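For example, something like this minimal enum; it's a rough sketch and the names are illustrative:

/** Business-layer meaning of the request_status column. */
public enum RequestStatus {
    SUCCESSFUL(1, "request successful"),
    INCORRECT_INPUT(2, "request failed due to incorrect input data"),
    OTP_VERIFICATION_FAILED(3, "request failed in otp verification"),
    INTERNAL_SERVER_ERROR(4, "request failed due to internal server error");

    private final int code;
    private final String description;

    RequestStatus(int code, String description) {
        this.code = code;
        this.description = description;
    }

    public int getCode() { return code; }
    public String getDescription() { return description; }

    /** Maps the integer stored in the database back to the enum constant. */
    public static RequestStatus fromCode(int code) {
        for (RequestStatus status : values()) {
            if (status.code == code) {
                return status;
            }
        }
        throw new IllegalArgumentException("Unknown status code: " + code);
    }
}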
The table proposed is as follows
coupon_fetching_request_status
---------------------------------------------------
status_id | status_code | status_description
My question is: is it necessary to create a table for such a simple status in similar cases?
I created this simple example to illustrate the problem. In reality the table represents a Discount Coupon Code Request, with the status indicating whether the code was successfully fetched.
It really depends on your use case.
To start with: in your main table, you are already storing request_status as an integer, which is a good thing (if you were storing the whole description, like 'request successful', that would not be efficient).
The main question is: will you eventually need to display that data in a human-readable format?
If no, then it is probably useless to create a representation table.
If yes, then having a representation table would be a good thing, instead of adding code in the presentation layer to do the translation; let the data live in the database, and let the frontend take care of presentation only.
Since this table can be easily created when needed, a pragmatic approach would be to hold on until you have a real need for the representation table.
You should create the reference table in the database. You currently have business logic on the application side, interpreting data stored in the database. This seems dangerous.
What does "dangerous" mean? It means that ad-hoc queries on the database might need to re-implement the logic. That is prone to error.
It means that if you add a reporting front end, then the reports have to re-implement the logic. That is prone to error and a maintenance nightmare.
It means that if you have another developer come along, or another module implemented, then the logic might need to be re-implemented. Red flag.
The simplest solution is to have a reference table that defines the official meanings of the codes. The application should use this table (via a join) to return the strings; the application should not be defining the meaning of codes stored in the database. YAGNI doesn't apply, because the application already needs this information so badly that it implements the logic itself.
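To make the join concrete, here is a minimal JDBC sketch, assuming the table and column names from the question (the connection string is illustrative):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class RequestStatusReport {
    public static void main(String[] args) throws Exception {
        // The reference table supplies the descriptions; the application
        // never hard-codes what a status code means.
        String sql =
            "SELECT r.request_id, r.request_time, s.status_description " +
            "FROM coupon_fetching_request r " +
            "JOIN coupon_fetching_request_status s " +
            "  ON s.status_code = r.request_status";

        // Illustrative connection details; use your real datasource.
        try (Connection conn = DriverManager.getConnection("jdbc:postgresql://localhost/coupons", "app", "secret");
             PreparedStatement stmt = conn.prepareStatement(sql);
             ResultSet rs = stmt.executeQuery()) {
            while (rs.next()) {
                System.out.printf("%d %s %s%n",
                        rs.getLong("request_id"),
                        rs.getTimestamp("request_time"),
                        rs.getString("status_description"));
            }
        }
    }
}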

BigQuery-Java: difference between QueryResponse and GetQueryResultsResponse

In the sample code provided by Google, two classes are used to fetch results: QueryResponse and GetQueryResultsResponse.
I am not able to understand the purpose of these two classes. Do we have to use both?
We are getting data from both queryResponse.getRows() and queryResults.getRows().
I have gone through the docs but could not figure out what the difference between these two classes is, and which one is better to use.
Those two results are virtually identical (in fact, they are identical in the raw HTTP request). The difference is how you get them.
QueryResponse is returned by jobs.query(). This method can be used to run a query, but has only limited configuration options. It is intended as a convenience function. For more query options (such as setting a destination table, allowing large results, etc), use jobs.insert(). Another limitation of jobs.query() is that it may time out before the query has completed. Partly, this is because many clients (such as in AppEngine) require all HTTP requests to finish within 30 seconds or so. If jobs.query() times out, it will still report a job id that can be used to fetch the results with jobs.get_query_results().
GetQueryResultsResponse is returned by jobs.get_query_results(). This can be used to get the results of a query started by either jobs.query() or jobs.insert(). Query results (if you don't specify a destination table) are available for 24 hours after the query completes. jobs.get_query_results() allows you to fetch these results at any time. jobs.query() only gives you the query results once.
There is a further difference between the two, which is that jobs.query() just returns the first page of results. jobs.get_query_results() can be used to get multiple pages of results.
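As a rough sketch with the older google-api-services-bigquery Java client (the library that exposes these two model classes), assuming an already-authorized Bigquery instance:

import com.google.api.services.bigquery.Bigquery;
import com.google.api.services.bigquery.model.GetQueryResultsResponse;
import com.google.api.services.bigquery.model.QueryRequest;
import com.google.api.services.bigquery.model.QueryResponse;

public class QueryExample {
    static void runQuery(Bigquery bigquery, String projectId) throws Exception {
        // jobs.query(): convenient, but limited options and it may time out
        // before the job completes.
        QueryRequest request = new QueryRequest()
                .setQuery("SELECT word FROM publicdata:samples.shakespeare LIMIT 10");
        QueryResponse queryResponse = bigquery.jobs().query(projectId, request).execute();

        // Even on timeout, the job id lets you fetch results later.
        String jobId = queryResponse.getJobReference().getJobId();

        // jobs.getQueryResults(): fetch (and re-fetch) results by job id,
        // including further pages via the page token.
        GetQueryResultsResponse results =
                bigquery.jobs().getQueryResults(projectId, jobId).execute();
        while (true) {
            if (results.getRows() != null) {
                results.getRows().forEach(row -> System.out.println(row.getF().get(0).getV()));
            }
            String pageToken = results.getPageToken();
            if (pageToken == null) break;
            results = bigquery.jobs().getQueryResults(projectId, jobId)
                    .setPageToken(pageToken).execute();
        }
    }
}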
Hopefully this clarifies things a bit.

Getting specific Backbone.js models from a collection without getting all models first

I'm new to Backbone.js. I'm intrigued by the idea that you can just supply a URL to a collection and then proceed to create, update, delete, and get models from that collection, and have it handle all the interaction with the API.
In the small task-management sample applications and numerous demos I've seen on the web, collection.fetch() is used to pull down all models from the server and then do something with them. However, more often than not in a real application, you don't want to pull down hundreds of thousands or even millions of records by issuing a GET request to the API.
Using the baked-in collection.sync method, how can I specify parameters to GET specific record sets? For example, I may want to GET records with a date of 2/1/2014, or GET records that are owned by a specific user id.
In this question, collection.find is used to do this, but does that still pull down all records to the client first and then "find" them, or does the collection.sync method know to pass arguments along when doing a GET to the server?
You do use fetch, but you provide options as seen in collection.fetch([options]).
So for example to obtain the one model where id is myIDvar:
collection.fetch({
    data: { id: myIDvar },
    success: function (collection, response, options) {
        // do a little dance;
    }
});
My offhand recollection is that find, findWhere and where all filter on the client, over models that have already been downloaded into the collection. I believe with fetch the filtering takes place on the server side.
You can implement some kind of pagination on the server side and update your collection with a limited number of records. In this case all your data will be up to date with the backend.
You can do it by overriding the fetch method with your own implementation, or by specifying params.
For example:
collection.fetch({data: {page: 3}})
You can also use the findWhere method here:
collection.findWhere(attributes)

Database schema for HTTP transactions

I have a script that makes an HTTP call to a web service, captures the response, and parses it.
For every transaction, I would like to save the following pieces of data in a relational DB:
HTTP request time
HTTP request headers
HTTP response time
HTTP response code
HTTP response headers
HTTP response content
I am having a tough time visualizing a schema for this.
My initial thoughts were to create 2 tables.
Table 'Transactions':
1. transaction id (not null, not unique)
2. timestamp (not null)
3. type (response or request) (not null)
4. headers (null)
5. content (null)
6. response code (null)
'transaction id' will be some sort of checksum derived by combining the timestamp with the header text.
The reason I compute this transaction id is to have a unique id that can distinguish two transactions, but which at the same time can be used to link a request with its response.
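Concretely, here is a minimal sketch of how I might compute it; SHA-256 is just a working assumption, since any "some sort of checksum" would do:

import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;

public class TransactionId {
    // Derives an id from the request timestamp plus header text, so the
    // matching response row can carry the same id.
    static String transactionId(String requestTimestamp, String headerText) throws Exception {
        MessageDigest digest = MessageDigest.getInstance("SHA-256");
        byte[] hash = digest.digest((requestTimestamp + "\n" + headerText).getBytes(StandardCharsets.UTF_8));
        StringBuilder hex = new StringBuilder();
        for (byte b : hash) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(transactionId("2014-02-01T12:00:00Z", "Host: example.com"));
    }
}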
What will this table be used for?
The script will run every 5 minutes and log all of this into the DB. In addition, every time it runs, the script will check the last time a successful transaction was made. At the end of the day, the script will also generate a summary of all the transactions made that day and email it.
Any ideas on how I can improve this design? What kind of normalization and/or optimization techniques should I apply to this schema? Should I split it up into two or more tables?
I decided to use a NoSQL approach for this instead, and it has worked: I used MongoDB. The flexibility it offers with document structure, and not having to fix the number of attributes up front, really helped.
It is probably not the best solution to the problem, but I was able to optimize the performance using compound indexes.
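For reference, a compound index along these lines with the MongoDB Java sync driver; the field names are my assumption about the document shape:

import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.model.Indexes;
import org.bson.Document;

public class TransactionIndexes {
    public static void main(String[] args) {
        try (MongoClient client = MongoClients.create("mongodb://localhost:27017")) {
            MongoCollection<Document> transactions =
                    client.getDatabase("monitoring").getCollection("transactions");
            // Supports "latest successful transaction" lookups and the
            // daily-summary scan in one index.
            transactions.createIndex(Indexes.compoundIndex(
                    Indexes.descending("requestTime"),
                    Indexes.ascending("responseCode")));
        }
    }
}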