c++ - Search many strings over a very large text
I have 2 million strings and I have to find each of them in 1 TB of text data. Searching for them one at a time is not the best solution, so I was thinking of a better way, such as building a data structure like a trie over all the strings; in other words, a trie in which each node is a word. Is there a good algorithm, data structure, or library (C++) for this purpose?
Let me make the question more concrete.
For example, I have these strings: s1 - "I love you", s2 - "How are you", s3 - "what happened friend"
And I have a lot of text data: T1 - "Hi, my name is Omid and I love computers. How are you guys?", T2 - "Your every wish will be fulfilled, they tell me ...", T3, T4, ..., T10000
Then I want to go through each text and search it for every one of my strings. For this example I would report only this: s2 occurs in T1, and nothing else. I am looking for an efficient way to search for the strings, not a naive loop over all of them for every text.
I apologize for posting only a link, but if you do not mind reading research papers, the definitive reference on string matching algorithms is, in my opinion, the one by Simone Faro and Thierry Lecroq, where they compare the relative performance of many different string matching algorithms. I am sure there is one among them that fits your needs.
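To make the trie idea from the question concrete, here is a minimal sketch of one well-known multi-pattern technique, the Aho-Corasick automaton: a trie over all the search strings, extended with failure links so that each text is scanned once no matter how many strings there are. This is not taken from the paper mentioned above or from any library. The class name `MultiPatternMatcher` and its `feed` method are hypothetical names chosen for illustration, and the dense 256-entry transition table per node is an assumption made for simplicity.

```cpp
#include <array>
#include <cstddef>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper class (illustration only): Aho-Corasick over raw bytes.
class MultiPatternMatcher {
public:
    explicit MultiPatternMatcher(const std::vector<std::string>& patterns) {
        nodes_.emplace_back();  // node 0 is the trie root
        for (std::size_t id = 0; id < patterns.size(); ++id) insert(patterns[id], id);
        build_links();
    }

    // Scans one chunk of text. The automaton state is kept between calls, so
    // a match that straddles two chunks is still found. Each hit is
    // (pattern index, number of bytes consumed so far when the match ended).
    std::vector<std::pair<std::size_t, std::size_t>> feed(const std::string& chunk) {
        std::vector<std::pair<std::size_t, std::size_t>> hits;
        for (unsigned char c : chunk) {
            state_ = nodes_[state_].next[c];
            ++consumed_;
            // Visit the current node and every suffix node that ends a pattern.
            int n = nodes_[state_].ends.empty() ? nodes_[state_].out_link : state_;
            for (; n != 0; n = nodes_[n].out_link)
                for (std::size_t id : nodes_[n].ends) hits.emplace_back(id, consumed_);
        }
        return hits;
    }

private:
    struct Node {
        std::array<int, 256> next;      // transitions, -1 = missing (before build_links)
        int fail = 0;                   // longest proper suffix that is also a trie path
        int out_link = 0;               // nearest suffix node ending a pattern, 0 = none
        std::vector<std::size_t> ends;  // indices of patterns ending at this node
        Node() { next.fill(-1); }
    };

    void insert(const std::string& p, std::size_t id) {
        int cur = 0;
        for (unsigned char c : p) {
            if (nodes_[cur].next[c] == -1) {
                nodes_[cur].next[c] = static_cast<int>(nodes_.size());
                nodes_.emplace_back();
            }
            cur = nodes_[cur].next[c];
        }
        nodes_[cur].ends.push_back(id);  // note: empty patterns are never reported
    }

    // Breadth-first pass that sets failure links and fills in missing
    // transitions, turning the trie into a complete automaton.
    void build_links() {
        std::queue<int> q;
        for (int c = 0; c < 256; ++c) {
            int& child = nodes_[0].next[c];
            if (child == -1) child = 0; else q.push(child);
        }
        while (!q.empty()) {
            int u = q.front(); q.pop();
            int f = nodes_[u].fail;
            nodes_[u].out_link = nodes_[f].ends.empty() ? nodes_[f].out_link : f;
            for (int c = 0; c < 256; ++c) {
                int v = nodes_[u].next[c];
                if (v == -1) {
                    nodes_[u].next[c] = nodes_[f].next[c];
                } else {
                    nodes_[v].fail = nodes_[f].next[c];
                    q.push(v);
                }
            }
        }
    }

    std::vector<Node> nodes_;
    int state_ = 0;
    std::size_t consumed_ = 0;
};
```

You would build one matcher from all the search strings and then call `feed` on each text, or on successive blocks read from a large file, collecting the returned pattern indices. Construct a fresh matcher (or add a reset) between unrelated texts so that a match cannot span two of them. With 2 million strings the dense table in this sketch costs about 1 KB per trie node, so a real implementation would probably need a more compact transition structure; treat this as a starting point rather than a tuned solution.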