Each BSON data type has a numeric code and a string alias, either of which can be used with the $type operator to query documents by BSON type. Additionally, there is a "number" alias that matches all the numeric data types.
db.addresses.find({"areaCode":{$type:"number"}})
This will find all the Documents in the addresses Collection where areaCode is numeric.
db.addresses.find({"areaCode":{$type:"string"}})
This will find all the Documents in the addresses Collection where areaCode is a string.
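The numeric codes work in the same place as the string aliases. Here's a rough sketch in plain JavaScript (it runs in Node, no server needed) that builds the filter document either way. The typeFilter helper and the BSON_TYPE_CODES table are names I made up for illustration, and the table is only a small subset of the BSON type codes:

```javascript
// A few BSON type codes keyed by their string alias (subset only).
const BSON_TYPE_CODES = {
  double: 1,
  string: 2,
  int: 16,
  long: 18,
  decimal: 19,
};

// Hypothetical helper: builds a filter document for the $type operator.
// With useCode = true it emits the numeric code instead of the alias.
function typeFilter(field, alias, useCode = false) {
  if (alias === "number") {
    // "number" covers several types and has no single code,
    // so this sketch always sends it as the alias.
    return { [field]: { $type: "number" } };
  }
  const code = BSON_TYPE_CODES[alias];
  if (code === undefined) {
    throw new Error("Alias not covered by this sketch: " + alias);
  }
  return { [field]: { $type: useCode ? code : alias } };
}

console.log(JSON.stringify(typeFilter("areaCode", "string")));
console.log(JSON.stringify(typeFilter("areaCode", "string", true)));
console.log(JSON.stringify(typeFilter("areaCode", "number")));
```

In the shell, the result would be passed straight to find, for example db.addresses.find(typeFilter("areaCode", "string", true)).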
Here's a quick look at some of the MongoDB basics:
Documents: The basic unit of data in MongoDB is a Document, composed of Field and Value pairs, which is equivalent to a row in a relational DB. Documents use JSON format but are stored internally as BSON. A sample document structure:
{ _id: ObjectId("5087250df3f4948bd2f72351"), "field2": value2, ... fieldN: "valueN" }
Fields are case-sensitive. Values are case- and type-sensitive.
Collections: MongoDB stores Documents in Collections, which are analogous to tables in a relational DB but with a dynamic schema. Collections can be created explicitly or implicitly during an insert or index creation.
Databases: Collections are grouped into Databases. Databases are logical and physical. Each Database has its own permissions and files in the filesystem. A MongoDB server can host multiple independent databases, each having its own collections. Databases can be created dynamically by using them.
_id: Every document has an identifying key, "_id", that is unique within a collection.
Running the following in MongoDB's JavaScript shell will create a new Database and Collection (if they don't already exist) and a new Document:
use myNewDB
db.myNewCollection1.insert({x:1})
As no _id field:value pair is specified here, it will also be created by default with an ObjectId. If we specify it instead, the field can be any data type but its value must be unique in the Collection.
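To make the _id rule concrete, here's a small in-memory stand-in written in plain JavaScript. It is not MongoDB code: TinyCollection is a made-up class, and the "generated-N" strings only stand in for the ObjectId the server would create. It just shows the two cases, a generated _id and a caller-supplied one that must be unique:

```javascript
// In-memory stand-in for a collection; NOT the MongoDB API.
class TinyCollection {
  constructor() {
    this.docs = new Map(); // stored documents, keyed by _id
    this.nextId = 1;       // counter standing in for ObjectId generation
  }

  insert(doc) {
    const hasId = Object.prototype.hasOwnProperty.call(doc, "_id");
    // Copy the document, adding a generated _id if none was supplied.
    const stored = hasId
      ? { ...doc }
      : { _id: "generated-" + this.nextId++, ...doc };
    // Reject a second document with the same _id.
    if (this.docs.has(stored._id)) {
      throw new Error("duplicate _id: " + String(stored._id));
    }
    this.docs.set(stored._id, stored);
    return stored._id;
  }
}

const coll = new TinyCollection();
console.log(coll.insert({ x: 1 }));            // _id generated for us
console.log(coll.insert({ _id: 1001, x: 1 })); // _id supplied by the caller
try {
  coll.insert({ _id: 1001, x: 2 });            // same _id again
} catch (e) {
  console.log(e.message);
}
```

A real server rejects the third insert with a duplicate key error rather than the plain Error used here.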
To query Collections we use the find command:
db.collection.find(query, projection)
query and projection are optional parameters used to filter Documents and Fields respectively.
To return only the _id of the Document we inserted above (and any others where x=1) we would run:
db.myNewCollection1.find({x: 1}, {_id: 1})
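To show what the two parameters do, here's a simplified imitation of find in plain JavaScript, working on an array instead of a Collection. It's a sketch under big assumptions: it only handles top-level equality in the query and inclusion-style projections (field: 1, plus the optional _id: 0), and the sample documents are invented:

```javascript
// Simplified imitation of find(query, projection) over an array.
// Supports only top-level equality and inclusion-style projections.
function find(docs, query = {}, projection = {}) {
  // query: keep documents where every listed field equals the given value.
  const matches = docs.filter((doc) =>
    Object.entries(query).every(([field, value]) => doc[field] === value)
  );

  // No projection given: return whole documents.
  if (Object.keys(projection).length === 0) {
    return matches.map((doc) => ({ ...doc }));
  }

  // projection: keep only the requested fields.
  return matches.map((doc) => {
    const out = {};
    // _id comes back by default unless it is switched off with _id: 0.
    if (projection._id !== 0 && "_id" in doc) out._id = doc._id;
    for (const [field, flag] of Object.entries(projection)) {
      if (field !== "_id" && flag === 1 && field in doc) {
        out[field] = doc[field];
      }
    }
    return out;
  });
}

// Invented sample data.
const docs = [
  { _id: "a", x: 1, y: "p" },
  { _id: "b", x: 2, y: "q" },
  { _id: "c", x: 1, y: "r" },
];

console.log(find(docs, { x: 1 }, { _id: 1 }));       // only _id
console.log(find(docs, { x: 1 }, { y: 1, _id: 0 })); // only y
console.log(find(docs));                             // everything
```

The query picks the Documents and the projection then trims the Fields of whatever was picked.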
Documents can be embedded within parent Documents. Here, the name and contact fields are embedded Documents:
{ _id: 1001, name:{first:"Max",last:"Musterman"}, contact:{phone:{type:"cell",number:"555-123-4567"}}, fieldN: "valueN" }
We reference fields within embedded Documents using dotted notation. In this case, if it were the clients Collection, the fields would be referenced in a query as "name.last" or "contact.phone", for example: db.clients.find({"name.last": "Musterman"})
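To query for all the Documents in the Collection for "Max Musterman" we would run:
db.clients.find({ name: { first: "Max", last: "Musterman" }})
Matching on a whole embedded Document like this is an exact match, so the fields must appear in the same order as they are stored.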
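Here's a plain JavaScript sketch of the two ways of matching an embedded Document: reaching one field through a dotted path, and comparing the whole embedded Document. Both helper names are mine. The whole-Document comparison uses JSON.stringify as a stand-in for MongoDB's field-by-field, order-sensitive equality, which is good enough for simple values like these:

```javascript
// Resolve a dotted path such as "name.last" against a nested document.
function getPath(doc, path) {
  return path.split(".").reduce(
    (value, key) =>
      value !== null && typeof value === "object" ? value[key] : undefined,
    doc
  );
}

// Order-sensitive comparison of two embedded documents.
// JSON.stringify keeps key order, so {first, last} differs from {last, first}.
function sameFieldsInOrder(a, b) {
  return JSON.stringify(a) === JSON.stringify(b);
}

const client = {
  _id: 1001,
  name: { first: "Max", last: "Musterman" },
  contact: { phone: { type: "cell", number: "555-123-4567" } },
};

// Dotted paths reach into the embedded documents.
console.log(getPath(client, "name.last"));
console.log(getPath(client, "contact.phone.number"));

// Whole-document match: same fields, same order.
console.log(sameFieldsInOrder(client.name, { first: "Max", last: "Musterman" }));
// Same fields in a different order do not match.
console.log(sameFieldsInOrder(client.name, { last: "Musterman", first: "Max" }));
```

Querying on the dotted paths, e.g. {"name.first": "Max", "name.last": "Musterman"}, avoids the ordering problem because each field is matched on its own.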
Embedded Documents are useful for denormalising data and avoiding joins.
Our company was making a technical presentation at this event, but I was not involved in presenting and was there simply as an attendee.
Overall, I didn't find MongoDB Europe as valuable as other such conferences (e.g. IDUG DB2 or Oracle World). I think it would be greatly improved by extending it to at least three days and making all the additional content user presentations.
The keynote address by Prof Brian Cox was not very relevant and only barely touched on the tenuous link to MongoDB: that some observatory data was hosted in MongoDB. Other than that, his talk had nothing to do with MongoDB and was a real waste of an hour in a one-day event.
The vendor presentations focused quite heavily on pushing their cloud solution, Atlas, and less on the release of MongoDB 3.4 than I had hoped.
Recordings of the presentations for "Shard 1" are available online:
M202: MongoDB Advanced Deployment and Operations is an advanced course for operations staff and DBAs. It gives a much better understanding of MongoDB concepts and is backed up by much more hands-on work than the introductory M102 course.
The course lectures are provided via YouTube videos as normal with MongoDB University and the practical side is performed on a provided VM installation.
I took the August 2016 release of the course and passed with a 100% grade.
Chapter 1: System Sizing and Tuning
Installing your VMs, MongoDB's use of memory, pre-heating data, spinning disks, SSDs, RAID, network storage, swap space, readahead, MongoDB CPU and disk usage
Chapter 2: Backup Options and Disaster Recovery
Disaster recovery requirements, assessing tolerance for data loss, assessing tolerance for downtime, disaster recovery in sharded clusters, backup strategies
Chapter 3: Fault Tolerance and Availability
Rolling maintenance, reading from secondaries, driver options, connection management, read preferences, rollback
Chapter 4: Sharded Cluster Management
Scaling out, config servers, periodic maintenance, the mongos process, chunks and splitting, pre-splitting data, the balancer, migration, tag-based sharding, hash-based sharding, unbalanced chunks, orphaned chunks, removing a shard
My first introduction to MongoDB was to sign up to university.mongodb.com and take the course M102: MongoDB for DBAs. This is the basic course for DBAs and while I wouldn't say it made me feel production-ready, it demystified MongoDB and gave me a good overview of JSON and the various MongoDB concepts.
The course lectures are provided via YouTube videos as normal with MongoDB University and the practical side is performed on a personal MongoDB installation.
I installed the latest version of MongoDB for Windows, 3.2.6, although the course recommended 3.2.2. However, I didn't have any problems related to the version.
I did encounter some problems due to starting MongoDB without the necessary permissions. This comes from how you start the Windows command line interface, cmd.exe. Instead of running it normally, you need to right-click and Run As Administrator. Then when you start your MongoDB processes they will function correctly. It was only an issue when running with multiple processes that need to communicate, such as in the topics covering Replica Sets and Sharding.
I took the May 2016 release of the course and passed with a 100% grade.
The questions were all quite straightforward and covered in the online course material.
Chapter 1: Introduction
Introduction to MongoDB, key concepts and installing Mongo
Homework 1.1
What do you get as a result?
Homework 1.2
What's the result?
Homework 1.3
Now, what query would you run to get all the products where brand equals the string "ACME"?
Homework 1.4
Check all that apply:
var c = db.products.find( { }, { name : 1, _id : 0 } ).sort( { name : 1 } ); while( c.hasNext() ) { print( c.next().name); }
var c = db.products.find( { } ).sort( { name : 1 } ); c.forEach( function( doc ) { print( doc.name ) } );
Chapter 2: CRUD and Administrative Commands
Creating, reading and updating data
Homework 2.1
What is the output? (The above will check that products_bak is populated.)
Homework 2.2
What is the output?
Homework 2.3
How many products have a voice limit? (That is, have a voice field present in the limits subdocument.)
Chapter 3: Performance
Indexing and monitoring
Homework 3.1
When you are done, run:
homework.a()
and enter the numeric result below (no spaces).
Homework 3.2
Once you have eliminated the slow operation, run (on your second tab):
homework.c()
and enter the output below. Once you have it right and are ready to move on, ctrl-c (terminate) the shell that is still running the homework.b() function.
Homework 3.3
Q1: How many products match this query?
Q2: Run the same query, but this time do an explain(). How many documents were examined?
Q3: Does the explain() output indicate that an index was used?
Check all that apply:
Which of the following are available in WiredTiger but not in MMAPv1? Check all that apply.
Chapter 5: Replication Part 2
Optimizing and monitoring your Replica Sets
Homework 5.1
What is the text in the "state" field for the arbiter when you run rs.status()?
Homework 5.2
Which of the following options will allow you to ensure that a primary is available during server maintenance, and that any writes it receives will replicate during this time?
Homework 5.3
You only have two data centers available. Which arrangement(s) of servers will allow you to stay up (as in, still able to elect a primary) in the event of a failure of either data center (but not both at once)? Check all that apply.
Homework 5.4
Find out the optional parameter that you'll need, and input it into the box below for your rs.reconfig(new_cfg, OPTIONAL PARAMETER).
Chapter 6: Scalability
Sharding setup, sharding monitoring, shard key selection, inserting large amounts of data
Homework 6.1
Run homework.a() and enter the result below. This method will simply verify that this simple cluster is up and running and return a result key.
Homework 6.2
Run homework.b() to verify the above and enter the return value below.
Homework 6.3
When done, run homework.c() and enter the result value.
Chapter 7: Backup and Recovery
Security, backups and restoring from backups
Final Exam:
Question 1:
How many documents do you have?
Question 2:
Question: Which of the following are true about mongodb's operation in these scenarios? Check all that apply.
Check all that apply.
Choose the best answer:
Reconfigure the replica set so that the third member can never be primary. Then run:
$ mongo --shell a.js --port 27003
And run:
> part4()
And enter the result in the text box below (with no spaces or line feeds just the exact value returned).
Once you have the config server running, confirm the restore of the config server data by running the last javascript line below in the mongo shell, and entering the 5 character result it returns.
Connect to the mongos with a mongo shell. Run this:
use snps
var x = db.elegans.aggregate( [ { $match : { N2 : "T" } } , { $group : { _id:"$N2" , n : { $sum : 1 } } } ] ).next(); print( x.n )
Enter the number output for n.
Based on the explain output, which of the following statements below are true?
I am a DBA with 20 years' experience of RDBMSs, from DB2 V5 on mainframes to Oracle and MariaDB on Unix.
My company is now jumping on the NoSQL bandwagon and we are adding MongoDB to our portfolio. We are actually quite far along this path already and are one of the largest MongoDB shops in Europe. The problem we face now is that we don't have any MongoDB DBAs in Ops to support it all and our recruiters seemingly can't find any MongoDB DBAs out there to recruit. So responsibility for maintenance is being handed to the SQL DBAs! Problem solved!!
I plan to use this Blog to document my personal MongoDB journey, and the mistakes we will no doubt make along the way.