
Don’t miss any vital info with File Indexing in OpenCTI

Dec 13, 2023

OpenCTI provides multiple ways to search for information. You can use the global search, search within the elements listed in a specific view, or filter lists to find groups of relevant information based on criteria.

But all these searches rely on data that has been structured into STIX objects: entities such as Threat Actors, Indicators, and Organizations, Observables such as File or IP, and the relationships between them.
What if the vital information you need has not been structured yet and is still hidden in the unstructured text of a file?

To meet this need, the OpenCTI team has introduced a new feature: full-text search in file content.


Indexing the content of uploaded files to make it searchable

The platform now offers a feature that extends the Global Search to the content of files uploaded to OpenCTI. It applies to documents uploaded through the Data Import menu and to those directly linked to an entity via the Data tab of a specific object.

This feature is especially useful for making sure you don’t overlook crucial information that has not been structured within the platform. This can happen when only part of a document’s content was imported automatically, when a connector imposes limitations, or, naturally, when errors occurred during manual processing of the data.

For this reason, enabling the Full text indexing feature is highly recommended. It indexes the content of the documents selected by your configuration and makes it available for search, so that critical information is not missed simply because it was never structured.

Step-by-step guide

Step 1: Configure file indexing

File indexing can be configured via the File indexing tab in the Settings menu.

The configuration panel lets you choose which file types have their content indexed, and provides options and thresholds to limit the impact of indexing on your platform. For example, you can restrict indexing to PDF and text files under 5 MB and exclude files that were uploaded outside the context of an entity.
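
To make the example above concrete, here is a minimal sketch of how such a selection rule could be expressed. It is not OpenCTI code: the names UploadedFile, IndexingConfig, and should_index, as well as the MIME type strings, are hypothetical and chosen for illustration only. In the platform itself, these choices are made in the File indexing tab.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional

MB = 1024 * 1024


@dataclass(frozen=True)
class UploadedFile:
    """Hypothetical description of a file uploaded to the platform."""
    name: str
    mime_type: str
    size_bytes: int
    entity_id: Optional[str] = None  # None: uploaded outside any entity


@dataclass(frozen=True)
class IndexingConfig:
    """Hypothetical mirror of the options described in the text."""
    allowed_mime_types: FrozenSet[str]
    max_size_bytes: int
    require_entity: bool


def should_index(f: UploadedFile, cfg: IndexingConfig) -> bool:
    """Return True if the file matches every criterion of the configuration."""
    if f.mime_type not in cfg.allowed_mime_types:
        return False
    # "under 5 MB" is read here as strictly smaller than the threshold
    if f.size_bytes >= cfg.max_size_bytes:
        return False
    if cfg.require_entity and f.entity_id is None:
        return False
    return True


# The example from the text: PDF and text files, under 5 MB, linked to an entity
example_config = IndexingConfig(
    allowed_mime_types=frozenset({"application/pdf", "text/plain"}),
    max_size_bytes=5 * MB,
    require_entity=True,
)
```

With this configuration, a 2 MB PDF attached to a report would be selected, while a 10 MB PDF, a spreadsheet, or a text file uploaded without any entity would be skipped.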

The information panel gives you insights into what will be indexed based on the chosen configuration.

Step 2: Launch the first file indexing run

Once you have started the indexing by clicking the Start button, you can monitor its progress in real time.

While indexing is running, you can temporarily Pause it if needed.

There is also a Reset button. Clicking it removes all previously indexed file content from the database and restarts the indexing process from the beginning. Use it when you want to start indexing anew.

Note that if you change the configuration while file indexing is running, you may need to use the Reset button for the changes to be fully applied, because the new configuration may cover files that were not covered by the old one.
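
The toy model below illustrates one way this situation can arise. It assumes an indexer that only looks at files uploaded since its last run; this is an assumption made for illustration, not a description of OpenCTI internals, and ToyIndexer with its fields is entirely hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple

# (file name, MIME type, upload timestamp)
FileRecord = Tuple[str, str, int]


@dataclass
class ToyIndexer:
    """Hypothetical incremental indexer that remembers how far it has read."""
    allowed_types: Set[str]
    indexed: Set[str] = field(default_factory=set)
    cursor: int = 0  # upload timestamp of the last file examined

    def run(self, files: List[FileRecord]) -> None:
        """Examine only files uploaded after the cursor, oldest first."""
        for name, mime_type, uploaded_at in sorted(files, key=lambda f: f[2]):
            if uploaded_at <= self.cursor:
                continue  # already examined in a previous run
            if mime_type in self.allowed_types:
                self.indexed.add(name)
            self.cursor = uploaded_at

    def reset(self) -> None:
        """Forget everything indexed so far and start again from the beginning."""
        self.indexed.clear()
        self.cursor = 0


files = [("a.pdf", "application/pdf", 1), ("b.txt", "text/plain", 2)]

indexer = ToyIndexer(allowed_types={"application/pdf"})
indexer.run(files)                       # only a.pdf matches the configuration

indexer.allowed_types.add("text/plain")  # configuration change
indexer.run(files)                       # b.txt was already examined, so it is skipped
indexed_before_reset = set(indexer.indexed)

indexer.reset()
indexer.run(files)                       # both files are examined again
indexed_after_reset = set(indexer.indexed)
```

In this model, widening the configuration has no effect on files that were already examined until a reset forces them to be evaluated again under the new rules.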

Finally, file indexing runs automatically every 5 minutes, so that even the most recently uploaded files are indexed and become searchable.

Step 3: Extend the global search to files

Once indexing is complete, running a Global Search also searches within the indexed files.

The Global Search results view shows two tabs. The “Knowledge search” tab contains results from the structured data in the platform. The “File search” tab contains results from the indexed files.

For convenience, you can open a document from the results in a new tab. By clicking the arrow at the end of the line, you can also navigate directly to the overview of the linked entity to understand the context of the document.

Conclusion

The introduction of full-text search on file content significantly extends OpenCTI’s search capabilities. It allows a thorough exploration of the content of uploaded documents, reducing the risk of overlooking vital information.

If you have any questions, requests, comments, or feedback to share with us, don’t hesitate to join us on Slack!
