Commit Graph

39 Commits

Author SHA1 Message Date
Cyberes 8c76455b60 be careful when reading elastic response 2024-02-12 16:56:52 -07:00
Cyberes b8c30e87d7 tidy up panic debug, tidy build 2024-02-12 16:47:05 -07:00
Cyberes 669f320b81 it's called a BLACKlist, shut the fuck up 2024-02-04 00:39:07 -07:00
Cyberes 5c3a646918 add some info to exe 2024-02-03 20:05:48 -07:00
Cyberes e1e6e2cbc2 fix some data races 2024-02-03 19:46:07 -07:00
Cyberes 665a2e8c18 reorganize admin crawl info json, clarify debug message 2024-01-23 15:11:15 -07:00
Cyberes ba2747c186 fix a couple panics 2024-01-23 14:59:48 -07:00
Cyberes e000d29aa5 adjust admin crawls info 2024-01-23 12:31:23 -07:00
Cyberes 4a74b00d46 take into account the walk function when checking if a crawl is already running for a path 2024-01-23 12:23:24 -07:00
Cyberes 11edbeadc3 reorganize data on admin pages, adjust global vars 2024-01-23 12:10:03 -07:00
Cyberes 7004e3935c adjust logging, catch rare ResponseItem panic for later 2024-01-23 11:49:34 -07:00
Cyberes d925847734 fix kernel panic 2024-01-23 10:41:01 -07:00
Cyberes 3af85db036 fix wrong key being used for delete 2023-12-18 12:02:30 -07:00
Cyberes 39513ffc36 reorder some logic 2023-12-13 22:51:13 -07:00
Cyberes 0ee56f20e7 improve startup sequence, fix the elastic key format 2023-12-13 20:09:02 -07:00
Cyberes 0469529f54 rebuild 2023-12-13 14:49:04 -07:00
Cyberes cd4364436a minor cleanup 2023-12-13 14:43:57 -07:00
Cyberes 631844fb98 fix panic 2023-12-13 14:31:00 -07:00
Cyberes e9db83f09b move elastic sync to workers instead of threads, parallel elastic delete sync, reimplement partial elastic sync 2023-12-13 14:21:47 -07:00
Cyberes d16eaf614e fix minor things 2023-12-12 18:00:43 -07:00
Cyberes 8d08f04a4f add crawl type indicator to admin status page 2023-12-12 17:52:03 -07:00
Cyberes a39b3ea010 reorganize 2023-12-12 17:30:47 -07:00
Cyberes 6377b8b6bc move elastic crawlers to workers 2023-12-12 17:26:39 -07:00
Cyberes 17c96e45c3 remove profiling 2023-12-12 00:54:50 -07:00
Cyberes 82636792ea dealing with memory usage 2023-12-11 23:45:09 -07:00
Cyberes 72e6355869 reorganize HTTP routes, improve JSON response 2023-12-11 22:36:41 -07:00
Cyberes 112ab0e08f clarify log item 2023-12-11 21:37:16 -07:00
Cyberes 2579c76f04 fix memory usage related to the worker queue size, reorganize things 2023-12-11 21:35:44 -07:00
Cyberes b5327e0c67 update todo 2023-12-11 19:12:00 -07:00
Cyberes 157f80a463 make workers global, fix worker setup, clean up 2023-12-11 18:50:30 -07:00
Cyberes 7078712bc3 track running crawls and add an admin page, use basic auth for admin, reject crawl if already running for a path, limit max directory crawlers, fix some issues 2023-12-11 18:05:59 -07:00
Cyberes a96708f6cf fix error when a file in the cache is not found on the disk 2023-12-11 16:18:12 -07:00
Cyberes 634f3eb8ea fix download encoding, redo config passing 2023-12-11 15:29:34 -07:00
Cyberes 4b9c1ba91a Merge dev to master 2023-12-08 22:25:59 -07:00
Cyberes 627f4d2069 limit max workers 2023-07-20 13:06:19 -06:00
Cyberes 4e9d3265fd refactor and performance improvements 2023-07-20 13:06:07 -06:00
Cyberes f40907dd8a add sorting arg to /list and /search with option to sort folders first; fix crawler on /list; fix json encoding empty children array to null on /list; fix recache 2023-07-18 10:58:29 -06:00
Cyberes fabe432ac4 should be pretty good! 2023-07-17 23:20:21 -06:00
Cyberes 2bead0284c Initial commit 2023-07-13 13:54:42 -06:00