Your Data Lives Somewhere
- Databases, and the very definition of one, come in all shapes and sizes. But ultimately every definition has the same roots: a system with which data can be stored persistently and reliably. Persistence (it's not lost when the application closes or the machine shuts down) and reliability (you will be able to retrieve as needed and have some sort of understanding of what it is you're retrieving when doing so) are two adjectives that are absolutely critical to a database. Without them you have little more than nothing.
Virtually every application is doing to use a database at some point or another. We live in a data-driven world. When you go to Target and buy socks, that transaction is stored in a database. When you go to your physician's office for a check up, they store that in your EMR (electronic medical record), otherwise known as a database. Every time you cut a check out to your mortgage company or landlord and they deposit it to their bank, it's recorded in a transactions database.
If we were to pull every bit of data, from every organization/enterprise/mom and pop shop/healthcare facility/etc., for any given person over the age of a few second old, you would be inundated with information. Today it is a possibility (if you got your hands on enough warrants to pull the necessary data from every entity) to effectively map out your entire life, day by day, minute detail by minute detail. It's scary to think about, but the next application you're going to write may very well just contribute to that fact. So it's important you know just how critical databases are to the world.
- In the 1970s it became very clear just how much data we were storing. Now at this point in time it's very small, but keep things in perspective: technology as we know it today was just starting to get ready to kick off. As datasets became more complex it was realized that a better way of storing them was needed. Enter relational databases.
The relational model solved a lot of problems and quickly became the new de facto standard. Databases were finally able to modeled after real world data sets. Strict schemas could be set in place to ensure data reliability and consistency. Atomic transactions were realized (importance of which is not so important to us in this tutorial). Efficiency in interacting (querying) data was improved by drawing relations between the datasets. A database was finally able to effectively communicate "I have this employee, and he has many duties assigned to him, but I only have a single record on file for his start date (as I should)."
And this model has persisted as the go-to standard right through to today. Oracle Database, mic ros oft SQL Server and MySQL represent the top three database applications used across the world. The former two are almost-strictly seen in enterprise/businesses only. MySQL is the five hundred pound gorilla in the web and open source worlds. And it's what we're going to be getting our hands dirty with today!
- MySQL would come along in the mid-90s, just in time for the dot com boom (timing is everything). Between it being open source and free (as in free beer), MySQL quickly picked up popularity. The demands of early web applications brought about a simple realization: this was going to be a data-driven domain. And MySQL facilitated that by providing an RDBMS (Relational Database Management System) that early webmasters could take advantage of.
To this day MySQL remains in the top three of all RDBMSes. It is arguably only beat by Oracle's flagship database system, which is used by virtually every one of the Fortune 100 companies (pretty big deal).
- An RDBMS is useless if it cannot be interacted with. This is where SQL (Structured Query Language).
- Without SQL you have nothing but a service running on your PC or server. You can forge connections to it (hopefully) but you cannot chat with it. Basically you have allocated resources that you're gaining nothing out of. SQL fixes that.
It is through SQL that we do everything with our RDBMS. Creating a new user? It's done in SQL. Formatting output prior to it hitting your application? SQL does that too. Create a new relation to store data? Yep, SQL. Placing actual data in that relation? You get my point.
SQL and relational databases go hand in hand. Even more so than peanut butter and jelly. After all, peanut butter without jelly is still delicious.
- You'll use SQL with any and all relational database systems. It is specifically purposed for relational databases, so that's the extent to where you'll use it.
- This will vary for every reader. If you're a PHP web developer then you'll use an extension such as
We're going to be interacting with MySQL directly through the command line in this tutorial. Everything we learn can be translated one-for-one into the language of your choice. The queries we'll see and dissect are valid regardless of how you interact with MySQL. But by doing so without any frills we are able to focus solely on the language itself. This also alleviates any concerns of abstractions deterring you from learning the nuts and bolts, which is what this tutorial aims to do introduce you to.
or . Ruby on Rails developers typically use the object-relational mapping (ORM) gem. Virtually every language supports some form of interaction with MySQL, be it more direct (such as the MySQLi extension in PHP) or an ORM library/extension. Connectors even exist for languages such as C++, C# and Visual Basic. If it's a mainstream language then the tools exist to interact with MySQL.
- Let's quickly go over installation MySQL on win dows and Mac OS X. If you're using Linux then I would encourage you to install MySQL via your distro's package manager (Yum, Apt, etc.). But given most will be on win dows, and the few that aren't are likely on OS X, I'm going to focus on them specifically.
- For win dows your best bet is to grab the
If you want MySQL to start automatically, then be sure to install it as a service. And if you have the option to do so, choose to have it added to your PATH. That will make the rest of this tutorial go much smoother.
. It's a GUI installer, so you'll just need to follow the prompts.
- A couple options exist for you. If you have Homebrew installed and set up then you can simply (via terminal):
brew search mysql
- DMGs for MySQL can also be found here. If you do not have Homebrew set up then this is your best bet. It will also get you the latest version (5.6 at the time of writing this), so that may be preferable anyways.
- We have just a couple steps to accomplish before we can really start this guide.Start MySQLIf you're on win dows and installed MySQL as a service then this should already be done for you. But you can quickly check by doing this (open your command prompt and follow along):
- C:\Program Files\MySQL\bin\mysqld.exe statusChange the path there to wherever you installed MySQL to. The result should either be an error because MySQL isn't running, or success confirming it is. If you're told it's not running then issue the following:C:\Program Files\MySQL\bin\mysqld.exe startTo make the rest of this tutorial easier, just change your directory to the MySQL bin directory:cd C:\Program Files\MySQL\binChanging the path to be appropriate for your installation, of course. Now you can simply call the application instead of writing out the full path:mysqld statusORmysqld.exe status (if the above doesn't work)And while you're at it, add that path to your PATH directive so you don't have to constantly type it out.OS X users should simply try (in terminal):mysql.server statusIf an error is received because MySQL is not running then issue:mysql.server startIt's impossible for me to know every use case here for what may be wrong with any given OS, so please post below if you have issues with this step. And I'll flesh this section out with scenarios.LoginYour first time logging in will be with the root user. However you're accessing the mysqld application (as you did above to start it), do the following:
- mysql -u rootIf MySQL installed correctly and is running, you should see the following:There will be copyright information above it as well. But if you do see this then congratulations: you're connected to MySQL! Let's rocka nd roll.Initial Setup
- At this point you should be sitting in front of a MySQL prompt. We're now going to start using SQL. Specifically: we're going to create a database to play around in and a user to access that database! Let's start, shall we?
MySQL is the RDBMS. It's a type of database system. Within it are databases that contain relations that contain data. Think of a database as a container for all of your relations (tables). Databases are usually created on a one-to-one basis per the applications accessing them. So for example, your MyBB forum has a database and your Wordpress blog has its own. Both databases may be served by the same RDBMS instance, but they're separate from one another internally. We're going to create a database specific for our testing here. Type and execute (hit enter) the following:
If successful you should see this:
CREATE DATABASE hf_testing;
This is telling you that one database was created. But what exactly did we just do?
Statements and Functions: SQL Has Them in Spades
- Like most languages, SQL too has statements and functions. Functions in SQL may take parameters; statements may take flags, data types, labels or other tokens (to an extent, so too do functions). In the above example we used the CREATE DATABASE statement. There are a number of flags we could have set (IF NOT EXISTS, for example, to ensure we don't err out if the database already exists) or we could have designated the character set. But the above is perfectly acceptable and what you will see most commonly.
- CREATE DATABASE is the statement call and hf_testing is the label we're passing it. This basic syntax is seen all over SQL
- With the database created and stored in MySQL, we can now create a user specifically for it. Let's do that now:
CREATE USER 'hackforums'@'localhost' IDENTIFIED BY 'mypassword';You should have an OK result from this query, if done correctly. Again, we see the statement call, but this time it's CREATE USER. As the name describes, it's going to create a user with the information following it. We're passing the following:Code:'bytehax'@'localhost'On the left of the @ symbol we have the username. Right now you're using a user with the username of root. We just created one with the username of hackforums. It's a string type, so we encase it in quotations (single or double- I went single). Now the @'localhost' may look a little weird. But it's actually quite simple: we only want this user being able to access the database from the local machine and not remotely (outside our network). This does just that. You could change that to allow specific IP address, IP address range, wildcard, etc. But for our purposes, localhost is just fine.We also called another statement: IDENTIFIED BY with a string passed to it. This sets a password for the user. Change 'mypassword' to whatever you'd like. Remove the statement entirely if you want. But it's best practice to give every user a password.
- Before we can log out and log back in with our new user, we need to give it some permissions. Right now our user can do just about nothing. We need to fix that. We're going to pick and choose what rights this user needs for this tutorial. These statements that we're allowing don't need to make sense to you right now; just know that they'll allow you to follow along with this tutorial:
GRANT CREATE, INSERT, SELECT, UPDATE, DELETEON hf_testing.*TO 'hackforums'@'localhost';You'll notice I broke this statements out onto separate lines. You can do so, or just write them all on one line, separated by a space. I like to break longer queries out by statement, just to see the query better. And subsequently spot errors more easily. But that's a coding style preference that you'll come into your own on.In this query we are issuing the GRANT statement. It expects a list of statements that the receiving username should be allowed to perform. So we're allowing the CREATE, INSERT, SELECT, UPDATE and DELETE statements. We need all 5 of these for this tutorial.We then specify, via the ON statement, what database and relations these rights apply to. You'll notice we're passing the database we just created. And we're appending a .*. Why? Because that's the nomenclature to say "every relation belonging to this database." We could list specific relations in the same format (database.relation). But we don't need to be that granular here. Your mileage will vary as you progress into bigger and more complication databases.Lastly, we state who these rights are being applied to. And unsurprisingly, it's the user we just created, restricted to localhost still.And with that we have the user we'll use from here on out! Exit MySQL:Code:exit;And log back in:Code:mysql -u bytehax -pEntering the password you set at the prompt. You should now be sitting in front of the MySQL prompt once again, ready to go!
- So I'll level with you: most developers just use their root account. The few that don't only don't because they're forced not to by the likes of cPanel. This is unfortunate.
The root account has god-like privileges to the entire RDBMS instance. That is: every piece of data, in every relation, in every single database. And even beyond that, root has access to system setting, users, and other nitty-gritty details that you do NOT WANT AN ATTACKER HAVING ACCESS TO.
The stricter your permissions the better, in the event an attacker gain access to your MySQL server with said user. Will an application ever need to completely drop a relation (delete it)? If not, then don't allow the user that application accesses through be allowed to drop relations!
This should be common sense, but unfortunately few follow it. Make me proud and be one of the few that do.
- We're logged in with our new hackforums user and ready to roll? Good. I have formatted the rest of this tutorial to introduce the basic CRUD statements within SQL. "CRUD you say? What is CRUD!?"
These four actions represent the basics of any data-driven application. Data is constantly being stored, read, changed and removed. So let's take a look at how to do that in SQL.
- I like to format my guides with little stories. Maybe it's a distraction, maybe it helps. I don't care- I like to do it. And like the other guide I wrote where I did it as a story, we're going to following the work life of Joey the Database Administrator (he has a name now I guess...). He works for Company XYZ and answers to Mrs. CIO. Oh and we'll see some poor performance by Mr. CFO (it's entirely coincidental that I just chose green for the CFO) that will affect our story. Stay tuned!
- Joey gets a call one day that he has been chosen for a contract position with the new company in town. Mrs. CIO, already being lazy, tells Joey that his first task is to make it so the database can store not only the two employees they are hiring, but also the date they started. Work anniversaries are important, after all. Just that- HR will paper-trail the rest of their vitals, because that's so efficient (not really- it just makes it easier on me when writing this guide). Joey gets to work.
Relations (Also Known as Tables)
- We know that everything is encapsulated by databases. And we just recently created a database to store our stuff. What databases really store are relations; you will also see them called tables. Think of them as Excel spreadsheets: you have columns defining what the data is, and rows that contain that actual data.Relational schemas are hard-defined. That is, you create them and then you use them. And while using them all data being placed into the relation must fit within the schema's rules. Let's just create a relation and dig into it.USE hf_testing;We saw above how relations can be accessed via the database.relation_name nomenclature. Well typing that out all of the time gets boring, so we issue the USE statement. We can now reference relations within the database without explicitly typing the database name.CREATE TABLE employees ( emp_id INT UNSIGNED NOT NULL AUTO_INCREMENT, emp_first_name VARCHAR(16) NOT NULL, emp_last_name VARCHAR(20) NOT NULL, PRIMARY KEY(emp_id), UNIQUE(emp_first_name, emp_last_name));CREATE TABLE employee_start_dates ( emp_id INT UNSIGNED NOT NULL, emp_started_on DATETIME NOT NULL DEFAULT NOW(), UNIQUE(emp_id));
INSERT INTO employees(emp_first_name, emp_last_name)
INSERT INTO employee_start_dates(emp_id)
Data is inserted into a relation via the INSERT INTO statement. It expects the following:
INSERT INTO table_name(columns, to, be, used) VALUES(data,maps,to,columns)
You declare what columns you're inserting into, and then provide that many values. It's a pretty simple format. Do note: if you're ever inserting numeric data types: do not send them as strings. That is, don't insert numbers as: "12". Send it without the quotes. This is a best practice.
You do not need to explicitly declare and insert into every column. If a default value is set for the column then you're okay to omit it. In the above case, the employee's ID number is auto incrementing (a default value, as it's being set by MySQL itself) and the DATETIME in the start dates relation will default to that very second.
The only thing of real interest here is the LAST_INSERT_ID() function. It will take the ID generated (by auto increment, for example) of the last INSERT statement and return it. This way we don't need to ask MySQL what the ID number is for the new employee; it'll just occur seamlessly. A little on the advanced side as far as this tutorial shouldb e concerned, but worth seeing.
One issue you might see arising here, however, is mix and matching of employees and start dates if the database gets hammered with new hires. This is a legitimate concern; it breaks a lot of best practices that we should look to follow. I will not even begin to explain this bit of code here (not even going to put it in code tags), but just know MySQL does fix that, and that I will touch on this in a future guide:
This transaction guarantees both queries happen one after the other, and do not get screwed up in a hail storm of new hires. But as I said- this is just a teaser for a future guide.
When you're not creating new relations or tuples (rows/data sets), you're probably reading them. But we read a lot of different type of data from MySQL. Yes you want to be able to grab tuples (rows) out of a relation for using it in your application. But administering MySQL may require seeing what databases are in existence, or what tables are in the database you're using. Let's just take a quick look at this, then move on to working with our data sets again.
- Issues with the employee_start_dates table?
- SHOW Information
- Poor Joey forgot the name of the database he's working with. Oh, he can just look at all databases currently in existence on this database server? Awesome!SHOW DATABASES;Your results will very, but the above is typical of a fresh installation. This lists every single database MySQL knows about. You can see our hf_testing database listed there!Now little old Joey needs to know what relations he has in the database. He forgets things too quickly. So he does the following:SHOW TABLES;If you're still using the right database, and have been following along, you'll see the above.Neither of these really require any explanations. I think they both do that for themselves very well. Let's jump into working with real data again.
- We grab data out of relations by using the SELECT statement. In its most simplistic form:
SELECT emp_first_name, emp_last_nameFROM employees;The SELECT statement expects a list of columns to return to you. We're specifically looking at the first and last name of employees in the employees relation.The FROM statement defines that last bit for us. SELECT needs to know where to pull the data from. FROM tells it just that.Now SELECT can be used to do some rather cool things, when combined with MySQL functions. Like for example:Like the transaction above, I won't dig into this now. But just know more complicated SELECTs can be performed.As you can probably tell, the most basic format expected here is:SELECT column_a, column_b, column_c, etc.FROM table_nameBut we don't always want to grab every single record! Wouldn't it be nice to be able to set a condition where maybe we just grab any employees with a specific first name?Joey was asked to do just this when Mrs. CIO asked him to facilitate a directory. She wants him to build a query example that will return all first names of people with a specific last name. He gives her this:SELECT emp_first_nameFROM employeesWHERE emp_last_name = 'Jones'What's different here? Only one thing: we've added a WHERE statement. The WHERE statement sets conditions that need to be met for records in order to return them. Specifically, we're asking for all first names in employees where the last name is 'Jones'. We'll see one more operator used with a WHERE statement in this guide, but you should know a full suite of them exist (AND, OR, >, <, =, etc. etc.).Now Mrs. CIO comes back and asks Joey to show her all employees, ranked by when they were inserted into the database. He knows that means he won't need the WHERE statement any more, because he wants all records but sorted in a certain fashion. How will he do it? Simple:SELECT emp_first_name, emp_last_nameFROM employeesORDER BY emp_id ASCWe're already familiar with most of this. But it's the ORDER BY statement that is new. ORDER BY does just that- orders the returned records by the given row(s) and in the direction stated. In this case we're ordering by the ID number and we want it to go smallest-to-largest, so ASCending. Alternatively we could go DESC (descending).There you have the basics of pulling information out of MySQL!
- UPDATE Those Records!
- With John and Amanda working together such long hours at Company XYZ they ended up dating, and then.... married? Ah crap, that means Joey needs to change her last name. And he does so just like this:
Some new things here! First off, basic format:
- UPDATE table_nameSET column_name = new_valueOPTIONAL WHERESo we're saying "update employees and set the last name to Smith for a record that currently is Jones with a first name of Amanda". This accurately finds Amanda and changes her last name, as the SELECT query shows us!What's really of interest here is we've built on the WHERE statement we just learned about. The AND conditional means that both sides of it (the last name being Jones and the first name being Amanda) must both be true to pass the test. If either side return false then the whole query goes to shit and nothing happens.Congratulations John and Amanda... but bad news!
- Remember Mr. CFO? Well he sucked and invested all of the firm's capital in BTC prior to it plummeting, and now the company is under water. Which sucks, because Amanda is pregnant, John already has anxiety, and everybody just needs their paychecks. Oh well. Joey has one last duty: delete all two employee records from the database. They just aren't needed any more.
DELETE FROM employees WHERE emp_id > 0;
Poor John and Amanda. The WHERE clause here is optional, as it usually is. I simply added it to show you how you can use another operator (this time the greater-than operator).
Syntax of DELETE FROM:
DELETE FROM table_nameOPTIONAL WHERE
- If not WHERE statement is given then all rows are removed from the table. Forewarning, in case that's not your intention!
Challenge (because I didn't query and screenshot it myself, and don't feel like it, as I have been typing for over two hours...): delete all records from the start dates relation and show us how you did it.