SQL & Postgres Interview Concepts [closed] - sql

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 7 years ago.
Introduction:
So, I have an interview tomorrow and I'm trying to review SQL and databases. The job posting says that they want someone with:
Experience with database design and development
Strong knowledge of SQL
Experience with SQL Server and/or Postgres
I've read through "Questions every good database SQL developer should be able to answer" and a bunch of questions tagged with SQL and interview-questions, so I realize that I need to know about SELECT, JOIN and WHERE.
Questions:
What are essential SQL, Postgres and database concepts that I need to know in order to do well in the interview?
What do I need to know about transactions and normalization?
What are some general ways to optimize slow queries?
Should I learn about the functions, keywords or both?

It depends on how much of the role is based around database development and design. For your SQL syntax, you should also understand the difference between the types of joins, and be able to use GROUP BY, ORDER BY, HAVING as well as the aggregate functions that can be used in conjunction with them.
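To make the join types and the GROUP BY / HAVING / aggregate combination above concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for Postgres or SQL Server, and the `customers` and `orders` tables and their rows are invented for illustration.

```python
import sqlite3

# In-memory SQLite stands in for Postgres/SQL Server; tables and rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cid');
    INSERT INTO orders VALUES (1, 1, 10.0), (2, 1, 15.0), (3, 2, 5.0);
""")

# INNER JOIN keeps only customers that have at least one order.
inner_rows = conn.execute("""
    SELECT c.name, COUNT(o.id) AS order_count, SUM(o.total) AS spent
    FROM customers c
    INNER JOIN orders o ON o.customer_id = c.id
    GROUP BY c.name
    ORDER BY c.name
""").fetchall()

# LEFT JOIN keeps every customer; COUNT(o.id) skips the NULLs produced for Cid.
left_rows = conn.execute("""
    SELECT c.name, COUNT(o.id) AS order_count
    FROM customers c
    LEFT JOIN orders o ON o.customer_id = c.id
    GROUP BY c.name
    ORDER BY c.name
""").fetchall()

# HAVING filters whole groups after aggregation; WHERE filters rows before it.
big_spenders = conn.execute("""
    SELECT c.name, SUM(o.total) AS spent
    FROM customers c
    INNER JOIN orders o ON o.customer_id = c.id
    GROUP BY c.name
    HAVING SUM(o.total) > 20
""").fetchall()

print(inner_rows, left_rows, big_spenders)
```

The customer without orders disappears from the inner join but shows up with a count of zero in the left join, which is the kind of difference an interviewer is likely to ask about.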
In terms of performance monitoring, I would be looking at execution plans (not sure about the Postgres equivalent) and how they can provide tips on increasing performance, as well as using SQL Profiler to see what instructions the server is executing in real time.
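For reference, Postgres exposes execution plans through EXPLAIN (and EXPLAIN ANALYZE). The sketch below shows the general idea of reading a plan before and after adding an index. It is an assumption-laden stand-in: it uses SQLite's `EXPLAIN QUERY PLAN` through Python's sqlite3 module, and the `orders` table, the index name and the helper `plan_for` are hypothetical. The plan text of Postgres or SQL Server looks different, but the workflow is the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)")
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 100, float(i)) for i in range(1000)],
)

def plan_for(connection, sql, params):
    """Return SQLite's plan description for a statement as one string."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " | ".join(row[-1] for row in rows)

query = "SELECT id, total FROM orders WHERE customer_id = ?"

# Without an index the engine has to scan the whole table.
plan_before = plan_for(conn, query, (42,))

# With an index on the filtered column the plan switches to an index search.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = plan_for(conn, query, (42,))

print(plan_before)
print(plan_after)
```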
Transactions can be useful for rolling back, well, transactions (stored procs, ad-hoc queries etc.) that require queries to complete in a certain way to maintain data consistency. Some people (myself included) have a practice of placing any statements that make any changes to data into a transaction that automatically rolls back (BEGIN TRAN ... ROLLBACK TRAN) to check that the correct amount of data is manipulated before pushing changes to a live server. Have a look at the ACID model - Atomicity, Consistency, Isolation, Durability.
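The "run it in a transaction that rolls back" habit described above can be sketched as follows. This is only an illustration using Python's sqlite3 module; the `accounts` table is made up, and on SQL Server the same idea would be written with BEGIN TRAN ... ROLLBACK TRAN as mentioned above.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode,
# so the BEGIN and ROLLBACK below are entirely under our control.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, status TEXT)")
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?)",
    [(1, "active"), (2, "stale"), (3, "stale")],
)

# Dry run: apply the change, look at how many rows it touched, then undo it.
conn.execute("BEGIN")
cur = conn.execute("UPDATE accounts SET status = 'archived' WHERE status = 'stale'")
affected = cur.rowcount
conn.execute("ROLLBACK")

# After the rollback the data is exactly as it was before the UPDATE.
remaining_stale = conn.execute(
    "SELECT COUNT(*) FROM accounts WHERE status = 'stale'"
).fetchone()[0]

print(affected, remaining_stale)
```

If the affected row count matches what you expected, you can rerun the statement with a COMMIT instead of the ROLLBACK.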
Normalization is something that can take a little time to go through, but just know and at least partially understand the normal forms up to third normal form (3NF) and that will get you started.
Optimisation can be a huge topic. Just remember to try and do things like UPDATE using set based queries, rather than row based (updating in a WHILE loop is an example of row based updating, but it CAN have its uses).
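As a rough illustration of set-based versus row-based updating, the sketch below applies the same kind of price change both ways. It uses Python's sqlite3 module with an invented `products` table; a Python loop plays the role that a WHILE loop or cursor would play inside the database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, price REAL, category TEXT)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?)",
    [(1, 10.0, "toys"), (2, 20.0, "toys"), (3, 30.0, "books")],
)

# Row-based: look up the matching ids, then send one UPDATE per row.
toy_ids = [row[0] for row in conn.execute("SELECT id FROM products WHERE category = 'toys'")]
row_based_statements = 0
for product_id in toy_ids:
    conn.execute("UPDATE products SET price = price + 1 WHERE id = ?", (product_id,))
    row_based_statements += 1

# Set-based: one statement describes the whole change and the engine does the rest.
cur = conn.execute("UPDATE products SET price = price + 1 WHERE category = 'toys'")
set_based_rows = cur.rowcount

prices = conn.execute("SELECT price FROM products ORDER BY id").fetchall()
print(row_based_statements, set_based_rows, prices)
```

Both approaches touch the same rows, but the row-based version needs one statement per row, and that overhead grows with the size of the table.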
I hope this helps a little.

Besides the basics of SQL syntax, which you listed, you should know some things about query performance: what are some common causes of slow queries, what are the remedies for them, and how can you evaluate the performance of a query?

Related

Why do NoSQL databases not provide support for ad hoc queries? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 3 years ago.
Suppose I have a table in an RDBMS with 26 columns, say A - Z.
With relational databases I can write queries which involve conditions on multiple columns. For example,
Select A, B
from table
where C > 12
and D = 'john'
and E between 3 and 6
order by F;
However, if I have the same table in a NoSQL database, all it provides is lookups based on primary keys or on some predefined GSI (taking DynamoDB as an example).
I can issue a scan against the table in a NoSQL database, but that is a lot slower than querying a table in an RDBMS, even if the columns involved are not indexed.
I want to understand why NoSQL databases scale very well but fail to provide a query language like SQL. Can someone shed some light on it?
You should be more specific about which database(s) you're asking about. You mention DynamoDB, but it's not clear in your question whether this is one example, or are you asking only about DynamoDB?
There are over 220 products that call themselves NoSQL, and they have different characteristics.
Some have an SQL-like language, some don't.
Some support queries to search by secondary attributes, some don't.
It's more a question of why a specific product didn't implement a SQL-like language, not a limitation of "NoSQL" as a broad category of products.
Your question is like asking "why don't non-motorcycles have a clutch?" The answer is that non-motorcycles is a broad category of vehicles, some of which actually do have a clutch, whereas some others were designed not to need a clutch.
No-SQL databases are designed on the premise that the data contained within them is schemaless. Thus, there is no pre-defined structure for the data which a database engine can easily use to determine how to execute an ad-hoc query. However, some no-sql database engines (e.g. Couchbase) do indeed offer such a capability.
The issue with database management systems in general has rarely been about storage and retrieval efficiency, but rather query plan optimization. In general, computers are not very good about dealing with issues created by poor designs. Also in general, most developers are not good at properly structuring data such that it can be queried quickly and easily by an automatically-generated query plan. Thus, most systems which rely upon automatically generated query plans tend to suffer performance issues.
In my opinion, the reason why a no-sql technology might not want to provide automatic query plan generation is that it forces the developer to give actual thought to the process of retrieving the data out, such that an efficient and effective plan might be devised in the code. Indeed, I have found that I am usually better at writing queries than the computer is. Could I restructure the data in such a way that the computer can write a good query plan the first time? Yes, but that takes more time than doing it myself to begin with.

Rails project without any SQL code - Every SQL is handled by Active Record [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 7 years ago.
My question is more about practice than about debugging an issue.
At work, we use a Java EE/Oracle solution, and to say the least we need to write SQL queries, anticipate SQL performance, handle SQL issues like foreign keys or orphan rows, and so on.
So from my point of view, knowing SQL is very important. For a new project, we are looking to implement the solution in Ruby on Rails. But most of the tutorials and code I see seem to hide all the Postgres SQL behind Active Record. I have already experienced similar issues with the Java Hibernate framework and its "no need for SQL code" approach. Some production issues were madness: the generated, hidden SQL queries were not easy to read, and there was no handling of indexes or foreign keys.
Can anyone tell me what risks we run by using only Active Record?
What is the proper process to avoid the most common Ruby/SQL interface issues?
When did you need to open your SQL console and type some SQL queries?
Please share a little of your experience on these points.
Any relevant links dealing with this topic would also be welcome.
Thank you very much !
You can still use SQL:
Either low level, where you receive an array of arrays of values.
Or a little more high level, so you receive objects, with methods like find_by_sql.
Or by providing only SQL fragments, for example for the where clause.
How often you need SQL depends on your use case.
Ruby is about objects, SQL is about tables. ActiveRecord handles objects as rows in a table. That works quite well most of the time. All simple queries are handled automatically. You can describe relations between objects, and even the joins needed to retrieve these relations are handled.
For queries with several joins or group_by, it is sometimes easier to write the SQL yourself instead of instructing ActiveRecord to build the SQL you have in mind.
You also need to keep an eye on what SQL is generated, as it is easy to write code that is inefficient, for example by generating many small SQL statements.
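The "many small SQL statements" problem is often called the N+1 query problem, and in Active Record the usual remedy is eager loading (for example with `includes`). The sketch below illustrates the underlying SQL pattern only. It uses Python's sqlite3 module rather than Rails, and the `authors` and `posts` tables and the `executed` log are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    INSERT INTO authors VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO posts VALUES (1, 1, 'A1'), (2, 1, 'A2'), (3, 2, 'B1');
""")

# N+1 pattern: one query for the posts, then one extra query per post.
executed = []  # SQL text of every statement sent in this section
sql_posts = "SELECT id, author_id, title FROM posts ORDER BY id"
executed.append(sql_posts)
posts = conn.execute(sql_posts).fetchall()

n_plus_one_result = []
sql_author = "SELECT name FROM authors WHERE id = ?"
for _post_id, author_id, title in posts:
    executed.append(sql_author)
    author_name = conn.execute(sql_author, (author_id,)).fetchone()[0]
    n_plus_one_result.append((title, author_name))
n_plus_one_statements = len(executed)

# Join pattern: the same result from a single statement.
joined_result = conn.execute("""
    SELECT p.title, a.name
    FROM posts p
    JOIN authors a ON a.id = p.author_id
    ORDER BY p.id
""").fetchall()

print(n_plus_one_statements, n_plus_one_result == joined_result)
```

With three posts the loop already issues four statements. The count grows linearly with the number of rows, which is why it pays to check the log for what the ORM actually sends.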
The official Rails guides about "models" are the most important resource. From an SQL perspective you should have a look at "Active Record Query Interface":
http://guides.rubyonrails.org/active_record_querying.html
I have also given a presentation about Rails database optimisation, but it is for Rails 3.2 and a little out of date (joins are now handled better):
http://meier-online.com/en/2012/08/presentation-rails-database/

Why we need to use T-SQL over SQL when creating reports from Data Warehouse? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 8 years ago.
Can someone tell me why we need to use T-SQL rather than SQL when creating reports from a data warehouse?
SQL also has functions and joins, but all of the online tutorials I see use T-SQL when creating reports from a DW.
Can it be done with SQL? If T-SQL is a must, could you please explain why, in terms of what T-SQL can do that SQL cannot?
Some useful tutorial links for T-SQL and creating reports would be great too!
Thanks in advance~
Setting aside the fact that T-SQL has more functionality than plain SQL, in data warehousing in general you have two main approaches:
Put business logic closer to the data. This way you develop lots of T-SQL functions and apply the many optimizations available there to improve the performance of your ETL. The pro is better performance of your ETL and reports. The cons are the cost of making changes to the code and the cost of migration. The usual path for a growing DWH is migration to one of the MPP platforms. If you have lots of T-SQL code in MSSQL, you'll have to completely rewrite it, which will cost a lot of money (sometimes even more than the cost of the MPP solution plus the hardware for it).
Put business logic into an external ETL solution like Informatica, DataStage, Pentaho, etc. This way you use pure SQL in the database, and all the complex logic (if needed) is the responsibility of your ETL solution. The pros are the simplicity of making changes (just move the transformation boxes and change their properties using the GUI) and the simplicity of changing the platform. The only con is performance, which is usually up to 2-3x slower than with an in-database implementation.
This is why you can find either a tutorial on T-SQL or a tutorial on an ETL/BI solution. SQL is a very general tool (there are many ANSI standards for it) and it is the basic skill for any DWH specialist. ANSI SQL is also much simpler, as it does not have any database-specific features.

Help finding old SQL tool that rewrote queries [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Closed 8 years ago.
There was this old SQL Server tool called Lectoneth or something like that. You'd put SQL queries in it, and it would rewrite them for you.
I think Quest bought them out, but I can't find where to download a free copy of that software.
It really helps when you have no DBA and have lots of SQL queries to rewrite.
Thanks
Craig
Doesn't ring a bell, and presumably you've already looked, but there is nothing obvious on Quest's website.
Perhaps a tool like Red Gate's SQL Prompt would help - the Pro edition does SQL reformatting.
Edit
I think I've found what you're looking for, mentioned here - LECCO SQL Expert. The link to the Lecco website does indeed redirect to Quest, but returns a 404.
LECCO SQL Expert is the only complete SQL performance tuning and optimization solution offering problematic SQL detection and automatic SQL rewrite. With its built-in Artificial Intelligence (AI) based Feedback Searching Engine, LECCO SQL Expert reduces the effort required to optimize SQL and makes even the most junior programmer an expert. Developers use LECCO SQL Expert to optimize SQL during application development. DBAs eliminate problematic SQL before users experience application performance problems by using LECCO SQL Expert in production systems.
Looks like it's no longer around: all the mentions I could find indicated it supported up to SQL 2000, with stale links, and it looks like it wasn't a free tool. As said in my comments, I think this kind of thing is a skill well worth possessing, and in the long run you would benefit from not relying on a tool to try and do it for you.
I wasn't aware of this tool before now, so I have picked up something from this question - got me intrigued!
Final Update:
To confirm, that product has indeed gone as Lecco was acquired some years ago now. Thanks to Brent Ozar for confirmation.
I think you're looking for a product that's been merged into Toad for SQL Server. The commercial version of Toad has a SQL Optimizer feature that tries lots of ways to rewrite your SQL statements, then tests them to find which ways are the fastest.
You can download Toad here:
http://www.toadsoft.com/
But be aware that that feature is a paid-version-only feature.
Well, rather than spending your time looking for a magic bullet, why not spend some time learning performance tuning (you will need a book, this is generally too complex for the Internet). Plus it is my belief that if you want to write decent new code, you need to understand performance in databases. There is no reason to be unable to write code that avoids the most common problems.
First, rewrite every query to use ANSI join syntax any time you open it up to revise it for any other reason. Code review all SQL changes and do not pass the code review unless explicit joins were used.
Your first step in performance tuning is to identify which queries and procs are causing the trouble. You can use tools that will tell you the worst performing queries in terms of overall time, but don't forget to tune the queries that are run frequently as well. Cutting seconds off a query that runs thousands of times a day can really speed things up. Also, since you are in prod already, your users are likely complaining about certain areas; those areas should be looked at first.
Things to look for that cause performance problems:
Cursors
Correlated subqueries
Views that call views
Lack of proper indexing
Functions (especially scalar functions that make the query run row by row instead of through a set)
Where clauses that aren't sargable
EAV tables
Returning more data than you need (If you have anything with select * and a join, immediately fix that.)
Reusing stored procedures that act on one record to loop through a large group of records
Badly designed autogenerated complex queries from ORMs
Incorrect data types, resulting in the need to continually convert data in order to use it
Since you have the old style syntax it is highly likely you have a lot of accidental cross joins
Use of distinct when it can be replaced with a derived table instead
Use of union when Union all would work
Bad table design that requires difficult construction of queries that can never perform well. If you find yourself frequently joining to the same table multiple times to get the data you need, then look at the design of the tables.
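To illustrate the "sargable" item in the list above: a predicate is sargable when the engine can use an index to seek directly to the matching rows, and wrapping the column in a function usually prevents that. The sketch below is a stand-in using SQLite through Python's sqlite3 module, with an invented `orders` table and a hypothetical helper `plan_for`. The plan output of other engines looks different, but the same rewrite (function on the column replaced by a range on the bare column) applies.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, created TEXT, total REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "2019-12-31", 5.0), (2, "2020-03-01", 7.5),
     (3, "2020-11-15", 9.0), (4, "2021-01-01", 1.0)],
)
conn.execute("CREATE INDEX idx_orders_created ON orders (created)")

def plan_for(connection, sql):
    """Return SQLite's plan description for a statement as one string."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)

# Not sargable: the function hides the column, so every row must be examined.
by_function = "SELECT id FROM orders WHERE substr(created, 1, 4) = '2020'"
# Sargable: a range on the bare column lets the engine seek within the index.
by_range = "SELECT id FROM orders WHERE created >= '2020-01-01' AND created < '2021-01-01'"

plan_function = plan_for(conn, by_function)
plan_range = plan_for(conn, by_range)

# Both queries return the same rows; only the access path differs.
rows_function = sorted(conn.execute(by_function).fetchall())
rows_range = sorted(conn.execute(by_range).fetchall())

print(plan_function)
print(plan_range)
```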
Also, since you have used implicit joins, you need to be aware that even in SQL Server 2000 the left and right implicit syntax does not work correctly. Sometimes it is interpreted as a cross join instead of a left join or right join. I would make it a priority to find and fix all of these queries immediately, as they may currently be returning an incorrect result set. Bad data results are even worse than slow data returns.
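The "accidental cross join" risk of the old comma syntax mentioned in the list above can be shown with a small sketch. It uses Python's sqlite3 module and invented `customers` and `orders` tables. It does not reproduce the SQL Server-specific implicit outer join syntax, only the simpler case of a forgotten join condition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER);
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cid');
    INSERT INTO orders VALUES (10, 1), (11, 1), (12, 2);
""")

# Old-style implicit join where the join condition was forgotten in the WHERE clause:
# every customer is paired with every order (a Cartesian product).
accidental_cross = conn.execute(
    "SELECT c.name, o.id FROM customers c, orders o"
).fetchall()

# Explicit join: the condition sits next to the JOIN, so leaving it out is easy to spot.
explicit_join = conn.execute("""
    SELECT c.name, o.id
    FROM customers c
    INNER JOIN orders o ON o.customer_id = c.id
    ORDER BY o.id
""").fetchall()

print(len(accidental_cross), len(explicit_join))
```

Three customers and three orders already give nine rows instead of three. On production-sized tables the same mistake multiplies row counts by thousands, and a DISTINCT added to hide it makes things slower still.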
Good luck.

What simple guidelines would you give your developers for writing good SQL against Oracle? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 9 years ago.
I work in a group of about 25 developers. I'm responsible for coming up with the database design (tables, views, etc.) and am called upon for performance tuning when necessary.
There are a couple of different applications that connect. Database access is via JDBC, hibernate, and iBatis SQL maps. Developers with various levels of experience write SQL statements.
What guidelines would you give to developers to write good SQL?
By good I mean: correct, performs well, easy to understand and maintain.
These are just meant to be easy to follow guidelines - I want to get people onto the right track for the majority of situations. We will break these guidelines when it makes sense.
EDIT: We have in place code reviews for all source commits (SQL, java, etc) enforced through a jira workflow.
If you have 25 developers writing SQL queries against your database you are in quite a bit of trouble. Guidelines are not worth much when your junior developers are learning SQL and checking in a mess.
I would like to offer 4 recommendations
Use an ORM of some sort so all your devs write less SQL.
Invest in training, buy books, send people to courses.
Have all the SQL reviewed by the senior SQL developers, by all, I mean every SQL statement, no exceptions. This way your senior guys will be able to teach the juniors over time.
Have a single person, who lives and breathes Oracle, responsible for the database. By responsible I mean someone who knows every query, understands all the structure and is able to give expert advice.
Here are some additional things you may add to your existing guidelines/checklist.
Have you tested your queries on a large data set? How was performance?
Have you performed a quick index review on the tables that are being accessed? Are all the right indexes in place? Do you recommend any new indexes?
For high volume queries, are any covering indexes required?
Are you using "NOT IN" in cases where a "LEFT JOIN" should be used?
Is your work transactionally sound? Are you missing a transaction somewhere?
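One easily demonstrated reason to be careful with NOT IN, from the checklist above, is its behaviour when the subquery returns a NULL: the predicate then never evaluates to true and no rows come back. The sketch below shows this with Python's sqlite3 module and invented `customers` and `orders` tables. The performance side of the comparison depends on the engine and is not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER);
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cid');
    INSERT INTO orders VALUES (10, 1), (11, NULL);
""")

# NOT IN: "2 NOT IN (1, NULL)" evaluates to NULL rather than true, so nothing is returned.
not_in_rows = conn.execute("""
    SELECT c.id FROM customers c
    WHERE c.id NOT IN (SELECT customer_id FROM orders)
    ORDER BY c.id
""").fetchall()

# LEFT JOIN plus IS NULL: customers with no matching order.
left_join_rows = conn.execute("""
    SELECT c.id FROM customers c
    LEFT JOIN orders o ON o.customer_id = c.id
    WHERE o.id IS NULL
    ORDER BY c.id
""").fetchall()

# NOT EXISTS gives the same answer as the LEFT JOIN form.
not_exists_rows = conn.execute("""
    SELECT c.id FROM customers c
    WHERE NOT EXISTS (SELECT 1 FROM orders o WHERE o.customer_id = c.id)
    ORDER BY c.id
""").fetchall()

print(not_in_rows, left_join_rows, not_exists_rows)
```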
Here's what I already have in my guidelines.
Work in sets, not row by row
The best way to make something go quicker is to avoid doing work you do not have to do
Databases love to join
Fully qualify and specify column names (so SQL does not break when additional columns are added)
Select only the data you need (never select *, never more rows than you require, never every column just because it's there)
How to use rownum to limit resultsets
Bind Variables vs Literals (use bind variables in all but a few special cases related to skewed data)
Avoid functions or calculations on columns in the WHERE clause (except for a special case of function based index)
Use ORDER BY for all queries returning more than one row (this is mostly for testability)
Each of these points is expanded a bit in the actual guidelines I've written out with an example relevant to our database schema.
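As an illustration of the "bind variables vs literals" guideline above, the sketch below contrasts a statement built by string concatenation with one that uses a bind placeholder. It uses Python's sqlite3 module (placeholder `?`) and an invented `people` table. In Oracle the main motivation is usually that bind variables let parsed statements be shared between executions. This sketch only demonstrates the syntax and the quoting problem, not the parsing behaviour.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT)")
# The insert itself uses bind placeholders, so the apostrophe needs no escaping.
conn.execute("INSERT INTO people VALUES (?, ?)", (1, "O'Brien"))

name = "O'Brien"

# Literal built by concatenation: the apostrophe ends the string early and the SQL breaks.
try:
    conn.execute("SELECT id FROM people WHERE name = '" + name + "'")
    literal_failed = False
except sqlite3.OperationalError:
    literal_failed = True

# Bind variable: the statement text stays constant and the value is passed separately.
bound_rows = conn.execute("SELECT id FROM people WHERE name = ?", (name,)).fetchall()

print(literal_failed, bound_rows)
```

The same habit also covers the "don't do string concatenation in SQL" advice: the statement text never changes, only the bound values do.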
Read Tom Kyte's books. He explains how you can write fast code and how you can measure performance and scalability. If you have a problem you can probably find the answer on the "ask tom"-site.
Introduce a basic style guide that covers:
naming (of everything - tables, columns, procedures, aliases, ...)
formatting style
line width
which reserved words require a new line (e.g. WHERE)
whether reserved words are written in upper case or lower case
indenting
...
Here are some examples:
Oracle PL/SQL Programming, Fourth Edition. There is also an older 2nd edition, available online
SQL and PL/SQL Coding Standards
Be very strict about naming; it will make it easier for you to read other people's code.
As far as formatting is concerned, there are tools available that can format automatically, so maybe you don't need a very detailed description here.
If you are a database developer, you need to know what an EXECUTION PLAN is. If you don't then go mine coal or something.
Before developing:
first, you think what best EXECUTION PLAN will be,
second you create tables and indexes, and
third you use hints to persuade the optimizer to come out with the plan you made.
You do use hints. Forget automatic optimization, it's a marketing myth. No optimizer knows your data better than you and never will.
There are no "programmers who create queries" and "system administrators who create indexes". Programmers program, system administrators make backups (or whatever they make).
Triggers are evil.
Prefix your columns, tables and views (SELECT prs_name FROM t_person)
Make lines and indent
An hour long presentation on some Oracle fundamentals (eg parsing, SGA vs PGA). "Do this" rules may or may not apply to your situation. Give them an understanding of what the DB side does, and they at least have a basis on which to make a decision.
Plus Code reviews.
Pair-program. Any advantage it provides for agile development in general at least doubles for SQL development.
Second choice, code reviews for all SQL.
Along with the recommendation to have queries reviewed by senior programmers, if you can get the buy-in, have code reviews which involve as many team members as possible.
I'm by no means a guru but here are my tips:
Don't use ORDER BY unless you really need an ordered list as it incurs a performance hit.
Understand the explain plan, and also recognise that the plan in your development environment is often different from the one in your production environment. Don't expect it to accurately reflect real-life performance.
The pro of using hints is that you get to choose your explain plan; the con is that the optimal plan may change over time and you might be choosing a plan that is suboptimal in the long term.
Make sure the developers know when to use INNER JOIN, OUTER JOIN, [NOT] IN, [NOT] EXISTS - you can put in place a lot of processes but one or two Cartesian products will bring production performance to its knees
Ensure your developers understand indexes - what they are, when they should be used, when they should be avoided
Have a DBA monitor the most executed queries and the most expensive queries and highlight these as candidates for optimisation
Peer review
Coding standards (especially code comments on particularly long/complex queries)
Unit testing
Don't write SQL if you can help it; use HQL (or JPQL if you are on Java EE) whenever possible
Don't use SELECT *
Pick your internet sources wisely (e.g. asktom.oracle.com)
Don't use cursors
Don't do string concatenation in SQL
Write queries such that they use indexes (fundamentally this means base WHERE predicates on the indexes that exist)
use MERGE instead of other awkward 'upsert' type logic
When working with dates, make sure you understand how they're stored in Oracle vs. how they are stored in Java, especially when it relates to TimeZone. Depending on the Calendar/Date types, this information can be stripped out, remapped to the TZ of the default locale, etc.
Most importantly: Don't use the excuse of being a developer for not knowing how to write good SQL, and how the database works. You don't have to be a DBA, but you need to invest in your own training to make yourself suitable for the task. By the same token, your company needs to invest in that as well.
I don't mean to say that these "Don'ts" always apply. It's just that, if you're talking about a developer who is not comfortable with Oracle, they need to know what they're doing before they start deciding whether those types of things are necessary and appropriate.