Tuesday, 29 November 2016

MongoDB Basics I - Data Structures

Here's a quick look at some of the MongoDB basics:
  • Documents: The basic unit of data in MongoDB is a Document, composed of Field and Value pairs and roughly equivalent to a row in a relational DB.
    Documents use JSON format but are stored internally as BSON.
    A sample document structure:
    {
       _id: ObjectId("5087250df3f4948bd2f72351"),
       "field2": value2,
       ...
       fieldN: "valueN"
    }

    Field names are case-sensitive. Values are case- and type-sensitive.
  • Collections: MongoDB stores Documents in Collections, which are analogous to tables in a relational DB but with a dynamic schema.
    Collections can be created explicitly or implicitly during an insert or index creation.
  • Databases: Collections are grouped into Databases. Databases are both logical and physical: each Database has its own permissions and its own files in the filesystem. A MongoDB server can host multiple independent Databases, each having its own Collections.
    Databases can be created dynamically simply by using them.
  • _id: Every document has an identifying key, "_id", that is unique within a collection.
Running the following in MongoDB's JavaScript shell will create a new Database and Collection (if they don't already exist) and a new Document:

use myNewDB
db.myNewCollection1.insert( { x: 1 } )


As no _id field is specified here, MongoDB adds one automatically with an ObjectId value. If we specify it ourselves, the value can be of any BSON data type other than an array, but it must be unique within the Collection.
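
The _id rules can be mimicked in plain JavaScript. What follows is only an in-memory sketch, not how MongoDB implements it: SketchCollection and its counter-based id generator are hypothetical stand-ins (a real server generates ObjectId values and enforces uniqueness itself).

```javascript
// In-memory sketch only: SketchCollection is a hypothetical stand-in,
// not part of MongoDB or its drivers.
class SketchCollection {
  constructor() {
    this.docs = new Map(); // keyed by a type-tagged string form of _id
    this.nextId = 1;       // stand-in for ObjectId generation
  }

  insert(doc) {
    // Supply an _id when the caller did not provide one.
    const _id = "_id" in doc ? doc._id : `generated-${this.nextId++}`;
    // Include the type in the key so that 1001 and "1001" stay distinct.
    const key = `${typeof _id}:${JSON.stringify(_id)}`;
    if (this.docs.has(key)) {
      throw new Error(`duplicate _id: ${JSON.stringify(_id)}`);
    }
    const stored = { _id, ...doc };
    this.docs.set(key, stored);
    return stored;
  }
}

const coll = new SketchCollection();
const first = coll.insert({ x: 1 });             // _id is generated
const second = coll.insert({ _id: 1001, x: 1 }); // _id supplied by the caller
```

Inserting a second document with _id 1001 into coll would throw, while inserting one with the string "1001" would succeed, because the two values differ in type.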

To query Collections we use the find method.
db.collection.find(query, projection)

query and projection are optional parameters used to filter Documents and Fields respectively.

To return the _id of the Document we inserted above (and of any others where x is 1) we would run:
db.myNewCollection1.find({x: 1}, {_id: 1})
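
To make the roles of the two parameters concrete, here is a rough plain-JavaScript imitation that runs without a server. findSketch and sampleDocs are hypothetical names; the helper only handles equality on top-level fields and inclusion-style projections, and it does not model MongoDB's other projection rules or query operators.

```javascript
// Hypothetical helper that imitates equality filtering and inclusion
// projection on an array of plain objects; it is not the MongoDB API.
function findSketch(docs, query = {}, projection = {}) {
  // Fields the caller asked to include, e.g. { _id: 1 } gives ["_id"].
  const wanted = Object.keys(projection).filter((f) => projection[f]);
  return docs
    // query: keep documents whose fields strictly equal the query values
    .filter((doc) => Object.entries(query).every(([f, v]) => doc[f] === v))
    // projection: with no fields listed, return a copy of the whole document
    .map((doc) =>
      wanted.length === 0
        ? { ...doc }
        : Object.fromEntries(
            wanted.filter((f) => f in doc).map((f) => [f, doc[f]])
          )
    );
}

const sampleDocs = [
  { _id: "a", x: 1, y: "one" },
  { _id: "b", x: 2, y: "two" },
  { _id: "c", x: 1, y: "uno" },
];

const ids = findSketch(sampleDocs, { x: 1 }, { _id: 1 });
// ids is [{ _id: "a" }, { _id: "c" }]
```

Because the comparison is strict, findSketch(sampleDocs, { x: "1" }) returns nothing, which mirrors the type-sensitivity of values.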

Documents can be embedded within parent Documents. Here, the name and contact fields are embedded Documents:
{
   _id: 1001,
   name: { first: "Max", last: "Musterman" },
   contact: { phone: { type: "cell", number: "555-123-4567" } },
   fieldN: "valueN"
}

We reference fields within embedded Documents using dotted notation. If this Document were in the clients Collection, the field paths we would use in a query on db.clients are:
"name.last"
or
"contact.phone"
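
The idea behind dotted notation, walking from the parent Document into its embedded Documents one key at a time, can be sketched in plain JavaScript. getPath is a hypothetical helper written for illustration; it only handles nested objects and leaves out MongoDB's array traversal.

```javascript
// Hypothetical helper: walks a dotted path through nested plain objects.
function getPath(doc, path) {
  return path.split(".").reduce(
    (value, key) =>
      value !== null && typeof value === "object" ? value[key] : undefined,
    doc
  );
}

const client = {
  _id: 1001,
  name: { first: "Max", last: "Musterman" },
  contact: { phone: { type: "cell", number: "555-123-4567" } },
};

const lastName = getPath(client, "name.last");           // "Musterman"
const phoneType = getPath(client, "contact.phone.type"); // "cell"
const missing = getPath(client, "name.middle");          // undefined
```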


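To query for all the Documents in the Collection for "Max Musterman" we would run the following. An equality match on a whole embedded Document is exact, including field order, so the fields are listed in the order in which they are stored:
db.clients.find({ name: { first: "Max", last: "Musterman" }})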

Embedded Documents are useful for denormalising data and avoiding joins.
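
As a rough illustration of that trade-off, the plain-JavaScript sketch below holds the same client data in two layouts. The arrays and function names are hypothetical and merely stand in for Collections; no MongoDB calls are involved.

```javascript
// Normalised layout: two hypothetical collections linked by clientId,
// so assembling one client needs a second lookup (a manual "join").
const clientsNormalised = [{ _id: 1001, first: "Max", last: "Musterman" }];
const phonesNormalised = [
  { clientId: 1001, type: "cell", number: "555-123-4567" },
];

function loadClientWithJoin(id) {
  const client = clientsNormalised.find((c) => c._id === id);
  const phone = phonesNormalised.find((p) => p.clientId === id);
  return { name: `${client.first} ${client.last}`, number: phone.number };
}

// Denormalised layout: the same data embedded in a single document,
// so one lookup returns everything.
const clientsEmbedded = [
  {
    _id: 1001,
    name: { first: "Max", last: "Musterman" },
    contact: { phone: { type: "cell", number: "555-123-4567" } },
  },
];

function loadClientEmbedded(id) {
  const client = clientsEmbedded.find((c) => c._id === id);
  return {
    name: `${client.name.first} ${client.name.last}`,
    number: client.contact.phone.number,
  };
}
```

Both functions return the same result for id 1001; the embedded version simply gets there with a single lookup.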


