# Why MongoDB Indexes Matter

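Because the collection holds a lot of data, sorting it directly (without an index) can exceed MongoDB's memory limit for a single in-memory sort: 32 MB before MongoDB 4.4, 100 MB from 4.4 onward.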

# createIndex()

WARNING

Removed in 5.0
db.collection.ensureIndex() has been replaced by db.collection.createIndex().

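createIndex() creates an index on a collection. ensureIndex() was the older name: it created an index on the specified fields only if that index did not already exist, and from MongoDB 3.0 it was a deprecated alias for createIndex(). Older write-ups claim that a second createIndex() call fails while ensureIndex() can be called repeatedly. In current versions, calling createIndex() again with the same specification does not fail; MongoDB simply keeps the existing index.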

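One more thing: the behavior of ensureIndex() changed over time. In MongoDB versions before 2.6, if the index entry for an existing document exceeded the maximum index key length, the index was still created but MongoDB did not index that document. In later versions the index is not created.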

{
  "_id": ObjectId("570c04a4ad233577f97dc459"),
  "score": 1034,
  "location": { "state": "NY", "city": "New York" }
}

The following operation creates an ascending index on the score field of the records collection:

db.records.createIndex({ score: 1 });

TIP

A value of 1 specifies an index that orders items in ascending order. A value of -1 specifies an index that orders items in descending order.

# Create an Index on an Embedded Field

{
  "_id": ObjectId("570c04a4ad233577f97dc459"),
  "score": 1034,
  "location": { "state": "NY", "city": "New York" }
}

The following operation creates an index on the location.state field:

db.records.createIndex({ "location.state": 1 });

The created index will support queries that select on the field location.state, such as the following:

db.records.find({ "location.state": "CA" });
db.records.find({ "location.city": "Albany", "location.state": "NY" });

# Create an Index on Embedded Document

db.records.createIndex({ location: 1 });

# Compound Indexes

db.collection.createIndex( { <field1>: <type>, <field2>: <type2>, ... } )

WARNING

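Before MongoDB 4.4, a compound index cannot include a hashed index field; attempting to create one returns an error. From 4.4 onward, a compound index may contain a single hashed field.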

{
 "_id": ObjectId(...),
 "item": "Banana",
 "category": ["food", "produce", "grocery"],
 "location": "4th Street Store",
 "stock": 4,
 "type": "cases"
}

The following operation creates an ascending index on the item and stock fields:

db.products.createIndex({ item: 1, stock: 1 });

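The order of the fields listed in a compound index matters. The index contains references to documents sorted first by the value of the item field and then, within each value of item, by the value of the stock field.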

TIP

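In practice this is the most commonly used kind of index. MongoDB can use a compound index for any query that matches a prefix of the index.
For example, if db.products.createIndex({item: 1, stock: 1}) already exists, there is no need to create a separate db.products.createIndex({item: 1}) index,
because a query that filters on item alone can also use the {item: 1, stock: 1} index.
A query that filters on stock alone, however, does not match a prefix of {item: 1, stock: 1} and is not served by it.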
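
The prefix rule in this tip can be sketched in plain JavaScript. This is a simplified model for illustration only, not MongoDB's query planner: the helper name `usablePrefixLength` is hypothetical, and it looks only at which fields a query filters on.

```javascript
// Simplified model of the index-prefix rule (not MongoDB's actual planner).
// Returns how many leading keys of the index the query can use.
function usablePrefixLength(indexSpec, queryFields) {
  const wanted = new Set(queryFields);
  let count = 0;
  for (const key of Object.keys(indexSpec)) {
    if (!wanted.has(key)) break; // the prefix ends at the first missing key
    count += 1;
  }
  return count;
}

const productIndex = { item: 1, stock: 1 };
console.log(usablePrefixLength(productIndex, ["item"]));          // 1
console.log(usablePrefixLength(productIndex, ["item", "stock"])); // 2
console.log(usablePrefixLength(productIndex, ["stock"]));         // 0
```

A result of 0 means the query does not touch the leading key, which is the stock-only case described above.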

# Sort Order

WARNING

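Indexes store references to fields in either ascending (1) or descending (-1) sort order.

For a single-field index, the sort order of the key does not matter, because MongoDB can traverse the index in either direction.

For a compound index, however, the sort order of the keys determines whether the index can support a given sort of the result set.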

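Consider a collection events whose documents contain the fields username and date. An application can issue a query that returns results sorted first by ascending username and then by descending date (most recent first), such as: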

db.events.find().sort({ username: 1, date: -1 });

or a query that returns results sorted first by descending username and then by ascending date, such as:

db.events.find().sort({ username: -1, date: 1 });

The following index can support both of these sort operations:

db.events.createIndex({ username: 1, date: -1 });

WARNING

However, the index above cannot support sorting by ascending username and then ascending date, such as:

db.events.find().sort({ username: 1, date: 1 });
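
The direction rule above can be checked with a small sketch. It is a simplified model under stated assumptions, not MongoDB's planner: the helper name `indexSupportsSort` is hypothetical, and it only handles the case where the sort keys form a prefix of the index keys (it ignores equality predicates on leading keys).

```javascript
// Simplified model: an index can return sorted results when the sort keys
// are a prefix of the index keys and the directions either all match the
// index or are all reversed (the index is then traversed backwards).
function indexSupportsSort(indexSpec, sortSpec) {
  const indexKeys = Object.keys(indexSpec);
  const sortKeys = Object.keys(sortSpec);
  if (sortKeys.length === 0 || sortKeys.length > indexKeys.length) return false;
  let allSame = true;
  let allReversed = true;
  for (let i = 0; i < sortKeys.length; i++) {
    if (sortKeys[i] !== indexKeys[i]) return false; // not a prefix
    if (indexSpec[indexKeys[i]] === sortSpec[sortKeys[i]]) {
      allReversed = false;
    } else {
      allSame = false;
    }
  }
  return allSame || allReversed;
}

const eventsIndex = { username: 1, date: -1 };
console.log(indexSupportsSort(eventsIndex, { username: 1, date: -1 })); // true
console.log(indexSupportsSort(eventsIndex, { username: -1, date: 1 })); // true
console.log(indexSupportsSort(eventsIndex, { username: 1, date: 1 }));  // false
```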

# reIndex()

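REINDEX rebuilds an index from the data stored in the table and replaces the old copy of the index. (This description comes from PostgreSQL's REINDEX command; MongoDB's counterpart, db.collection.reIndex(), drops all indexes on a collection and rebuilds them.) There are two main reasons to use REINDEX:

First, the index has become corrupted and no longer contains valid data. In theory this should never happen, but in practice indexes can be corrupted by software bugs or hardware failures, and REINDEX provides a recovery method. Second, the index contains a large number of dead index pages that have not been reclaimed. In some cases this happens with B-tree indexes in PostgreSQL. REINDEX reduces the index's space consumption by writing a new version of the index without the dead pages. See Section 21.2, "Routine Indexing", of the PostgreSQL documentation for more information.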

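# TTL Indexes

A TTL index is defined on a date field and gives documents an expiry time. It suits data that is only valid for a limited period: once a document passes the specified time it is deleted, so old data is cleaned up automatically. MongoDB removes expired documents with a background task that runs roughly every 60 seconds, not at a chosen off-peak time, so a document can briefly outlive its expiry, and expiring a large amount of data at once can still add noticeable delete load.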
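
The expiry rule can be illustrated with a small model in plain JavaScript. This is a sketch, not MongoDB's TTL monitor: the helper `isExpired`, the collection name `sessions`, and the field name `createdAt` are all hypothetical, and the model ignores arrays of dates.

```javascript
// Model of the TTL rule: a document expires once
// (value of the indexed date field) + expireAfterSeconds has passed.
// In mongosh the index itself would be created with something like:
//   db.sessions.createIndex({ createdAt: 1 }, { expireAfterSeconds: 3600 })
// ("sessions" and "createdAt" are hypothetical names).
function isExpired(doc, field, expireAfterSeconds, now) {
  const value = doc[field];
  if (!(value instanceof Date)) return false; // non-date values never expire
  return value.getTime() + expireAfterSeconds * 1000 <= now.getTime();
}

const now = new Date("2024-01-01T02:00:00Z");
const oldSession = { createdAt: new Date("2024-01-01T00:00:00Z") };
const newSession = { createdAt: new Date("2024-01-01T01:30:00Z") };

console.log(isExpired(oldSession, "createdAt", 3600, now)); // true
console.log(isExpired(newSession, "createdAt", 3600, now)); // false
```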

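# Partial Indexes

A partial index covers only the documents that match a filter condition, which saves index storage space and write cost.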

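db.user.createIndex({ "sns.qq.openId": 1 });
/**
 * Index on the QQ login openId. Only a small share of users sign in with QQ,
 * so only their documents carry this field. Instead of the plain index above,
 * index only the documents that match the filter.
 */
db.user.createIndex(
  { "sns.qq.openId": 1 },
  { partialFilterExpression: { "sns.qq.openId": { $exists: true } } }
);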

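# Sparse Indexes

A sparse index only contains entries for documents that have the indexed field, even if the field holds a null value. The index skips every document that is missing the indexed field.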

db.user.createIndex({ "sns.qq.openId": 1 }, { sparse: true });

TIP

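Partial indexes have been available since version 3.2 and can be treated as a superset of sparse indexes. The official recommendation is to prefer partial indexes over sparse indexes.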
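
To make the difference concrete, here is a small model of which documents end up in each kind of index. It is a sketch in plain JavaScript, not MongoDB code: the helper `getPath` and the sample users are hypothetical, the `$type: "string"` filter is only one example of a stricter partial filter, and array handling is ignored.

```javascript
// Look up a dotted path such as "sns.qq.openId".
// Simplified: ignores MongoDB's special handling of arrays.
function getPath(doc, path) {
  let current = doc;
  for (const part of path.split(".")) {
    if (current === null || typeof current !== "object" || !(part in current)) {
      return { found: false, value: undefined };
    }
    current = current[part];
  }
  return { found: true, value: current };
}

const users = [
  { _id: 1, sns: { qq: { openId: "abc" } } },
  { _id: 2, sns: { qq: { openId: null } } }, // field present, value null
  { _id: 3, name: "no QQ login" },           // field missing
];
const field = "sns.qq.openId";

// Sparse index: keeps every document that has the field, even with a null value.
const sparseIds = users.filter((u) => getPath(u, field).found).map((u) => u._id);

// Partial index with a stricter, hypothetical filter { $type: "string" }.
const partialIds = users
  .filter((u) => typeof getPath(u, field).value === "string")
  .map((u) => u._id);

console.log(sparseIds);  // [ 1, 2 ]
console.log(partialIds); // [ 1 ]
```

The partial filter can express conditions that a sparse index cannot, which is why it is described as a superset.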

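# The ESR Index Rule

Order of index fields: Equality (exact match) > Sort (sort condition) > Range (range query)

Put the fields matched exactly (Equality) first, the sort fields (Sort) in the middle, and the fields matched by range (Range) last. The same ordering applies when only two of the three are present (ES or ER).

A practical example: from a grades table, fetch the students in senior-high Class 2 whose math score is above 120, sorted by score from high to low. Class and subject (math) are exact matches, while score is both a range condition and the sort condition. Following the ESR rule, the index can be built as {"班级":1,"学科":1,"分数":1}, where the keys are class, subject and score.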
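
The ordering step can be sketched as a small helper. This is an illustrative model only: `buildEsrIndex` and the role labels are hypothetical, the English field names stand in for the class, subject and score keys of the example, and every key is given direction 1 to match the index shown above.

```javascript
// Order index fields by the ESR rule: Equality, then Sort, then Range.
// A field used for both sorting and a range filter takes its sort position.
const ROLE_RANK = { equality: 0, sort: 1, range: 2 };

function rank(field) {
  return Math.min(...field.roles.map((role) => ROLE_RANK[role]));
}

function buildEsrIndex(fields) {
  // Array.prototype.sort is stable, so fields with equal rank keep their order.
  const ordered = [...fields].sort((a, b) => rank(a) - rank(b));
  const spec = {};
  for (const f of ordered) spec[f.name] = 1;
  return spec;
}

const gradeQueryFields = [
  { name: "score", roles: ["sort", "range"] }, // > 120, sorted high to low
  { name: "class", roles: ["equality"] },
  { name: "subject", roles: ["equality"] },
];

console.log(JSON.stringify(buildEsrIndex(gradeQueryFields)));
// {"class":1,"subject":1,"score":1}
```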

# How do we check whether an index is hit and effective?

Look at the query's execution plan, which explain() returns.
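
As a rough illustration of reading a plan, the sketch below walks a winning plan and lists its stages. The helper `planStages` is hypothetical and the explain output is a hand-written sample shaped like the classic query engine's format, not captured from a real server. An IXSCAN stage means an index was scanned; a COLLSCAN stage means the whole collection was scanned.

```javascript
// Collect the stage names of a winning plan, outermost stage first.
function planStages(plan) {
  const stages = [plan.stage];
  const children = plan.inputStage ? [plan.inputStage] : plan.inputStages || [];
  for (const child of children) stages.push(...planStages(child));
  return stages;
}

// Hand-written sample shaped like the output of
//   db.products.find({ item: "Banana" }).explain()
const explainOutput = {
  queryPlanner: {
    winningPlan: {
      stage: "FETCH",
      inputStage: {
        stage: "IXSCAN",
        indexName: "item_1_stock_1",
        keyPattern: { item: 1, stock: 1 },
      },
    },
  },
};

const stages = planStages(explainOutput.queryPlanner.winningPlan);
console.log(stages);                      // [ 'FETCH', 'IXSCAN' ]
console.log(stages.includes("COLLSCAN")); // false
```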