Merge dev to master
parent
627f4d2069
commit
4b9c1ba91a
|
@ -0,0 +1,20 @@
|
|||
# Elasticsearch Integration
|
||||
|
||||
A background thread syncs the cache with Elastic, rather than syncing during the crawl. This is done so that the crawl
|
||||
is not slowed down and the webserver can start serving clients sooner. It may take hours to sync with Elastic, so it is
|
||||
better to run it as a background task.
|
||||
|
||||
There are two types of syncs: new and refresh. The "new" sync adds new files not already in Elastic and deletes files
|
||||
that are in Elastic but no longer in the cache. The "refresh" sync is a full sync and pushes every file to Elastic.
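As a rough illustration, the "new" sync described above reduces to two set differences between the cache keys and the keys already indexed in Elastic. The sketch below is an assumption about the shape of that logic, not the project's actual code; `PlanNewSync` and its parameters are hypothetical names.

```go
package main

import "sort"

// PlanNewSync is a hypothetical helper (not part of CrazyFS).
// It returns the keys that exist only in the cache (to be added to Elastic)
// and the keys that exist only in Elastic (to be deleted from it).
func PlanNewSync(cacheKeys, elasticKeys []string) (toAdd, toDelete []string) {
	inCache := make(map[string]struct{}, len(cacheKeys))
	for _, k := range cacheKeys {
		inCache[k] = struct{}{}
	}
	inElastic := make(map[string]struct{}, len(elasticKeys))
	for _, k := range elasticKeys {
		inElastic[k] = struct{}{}
	}

	// In the cache but not yet indexed: add.
	for k := range inCache {
		if _, ok := inElastic[k]; !ok {
			toAdd = append(toAdd, k)
		}
	}
	// Indexed but no longer in the cache: delete.
	for k := range inElastic {
		if _, ok := inCache[k]; !ok {
			toDelete = append(toDelete, k)
		}
	}

	// Map iteration order is random, so sort for stable output.
	sort.Strings(toAdd)
	sort.Strings(toDelete)
	return toAdd, toDelete
}
```

A "refresh" sync would skip the comparison and push every cache key.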
|
||||
|
||||
The intervals of these syncs are controlled by `elasticsearch_sync_interval` and `elasticsearch_full_sync_interval`.
|
||||
By default, only one sync job can run at a time but setting `elasticsearch_allow_concurrent_syncs` to `true` allows both
|
||||
to run at once.
|
||||
|
||||
On startup, a "new" sync is run. You can run a "refresh" sync by setting `elasticsearch_full_sync_on_start` to `true`.
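One way to picture the effect of `elasticsearch_allow_concurrent_syncs` is a small gate that each sync job must pass before it starts. This is a minimal sketch under that assumption; `SyncGate`, `TryStart`, and `Finish` are hypothetical names and do not come from the CrazyFS source.

```go
package main

import "sync"

// SyncKind names the two sync types described above.
type SyncKind string

const (
	SyncNew     SyncKind = "new"
	SyncRefresh SyncKind = "refresh"
)

// SyncGate is a hypothetical helper that decides whether a sync job may start.
type SyncGate struct {
	mu              sync.Mutex
	allowConcurrent bool
	running         map[SyncKind]bool // holds only the kinds currently running
}

func NewSyncGate(allowConcurrent bool) *SyncGate {
	return &SyncGate{allowConcurrent: allowConcurrent, running: make(map[SyncKind]bool)}
}

// TryStart marks the job as running and returns true if it is allowed to start now.
func (g *SyncGate) TryStart(kind SyncKind) bool {
	g.mu.Lock()
	defer g.mu.Unlock()
	if g.running[kind] {
		return false // never run two jobs of the same kind at once
	}
	if !g.allowConcurrent && len(g.running) > 0 {
		return false // another kind is running and concurrency is disabled
	}
	g.running[kind] = true
	return true
}

// Finish marks the job as done so that later jobs can start.
func (g *SyncGate) Finish(kind SyncKind) {
	g.mu.Lock()
	defer g.mu.Unlock()
	delete(g.running, kind)
}
```

With the gate created as `NewSyncGate(false)`, a "refresh" sync that fires while a "new" sync is still running is simply skipped until the next interval.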
|
||||
|
||||
Why don't we store the cache in Elasticsearch? Because Elastic is not as fast as fetching things from RAM.
|
||||
|
||||
### Searching
|
||||
|
||||
We do an Elastic [simple query string search](https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-simple-query-string-query.html).
|
52
README.md
|
@ -1,38 +1,51 @@
|
|||
TODO: add a "last modified" to "sort"
|
||||
in <https://chub-archive.evulid.cc/api/file/list?path=/chub.ai/characters&page=1&limit=50&sort=folders>
|
||||
|
||||
TODO: add an admin endpoint to fetch the last n modified files. Maybe store each file's update time in Elasticsearch?
|
||||
|
||||
TODO: fix the 3 loading placeholders
|
||||
|
||||
TODO: <https://github.com/victorspringer/http-cache>
|
||||
|
||||
TODO: fix encoding on https://chub-archive.evulid.cc/api/file/download?path=/other/takeout/part1.md
|
||||
|
||||
TODO: fix /api/file/download when an item is in the cache but does not exist on the disk
|
||||
|
||||
# crazy-file-server
|
||||
|
||||
_A heavy-duty web file browser for CRAZY files._
|
||||
*A heavy-duty web file browser for CRAZY files.*
|
||||
|
||||
The whole schtick of this program is that it caches the directory and file structures so that the server doesn't have to
|
||||
re-read the disk on every request. By doing the processing upfront when the server starts along with some background
|
||||
scans to keep the cache fresh we can keep requests snappy and responsive.
|
||||
|
||||
I needed to serve a very large dataset full of small files publicly over the internet in an easy to browse website. The
|
||||
existing solutions were subpar and I found myself having to create confusing Openresty scripts and complex CDN caching
|
||||
to keep things responsive and server load low. I gave up and decided to create my own solution.
|
||||
|
||||
The whole schtick of this program is that it caches the directory and file structures so that the server doesn't have to re-read the disk on every request. By doing the processing upfront when the server starts along with some background scans to keep the cache fresh we can keep requests snappy and responsive.
|
||||
You will likely need to store your data on an SSD for this. With an SSD, my server was able to crawl over 6 million
|
||||
files stored in a very complicated directory tree in just 5 minutes.
|
||||
|
||||
|
||||
|
||||
I needed to serve a very large dataset full of small files publicly over the internet in an easy to browse website. My data was mounted over NFS so I had to take into account network delays. The existing solutions were subpar and I found myself having to create confusing Openresty scripts and complex CDN caching to keep things responsive and server load low. I gave up and decided to create my own solution.
|
||||
|
||||
|
||||
|
||||
**Features**
|
||||
## Features
|
||||
|
||||
- Automated cache management
|
||||
- Optionally fill the cache on server start, or as requests come in.
|
||||
- Watch for changes or scan interval.
|
||||
- Optionally fill the cache on server start, or as requests come in.
|
||||
- Watch for changes or scan interval.
|
||||
- File browsing API.
|
||||
- Download API.
|
||||
- Restrict certain files and directories from the download API to prevent users from downloading your entire 100GB+ dataset.
|
||||
- Restrict certain files and directories from the download API to prevent users from downloading your entire 100GB+
|
||||
dataset.
|
||||
- Frontend-agnostic design. You can have it serve a simple web interface or just act as a JSON API and serve files.
|
||||
- Simple resources. The resources for the frontend aren't compiled into the binary which allows you to modify or even replace it.
|
||||
- Simple resources. The resources for the frontend aren't compiled into the binary which allows you to modify or even
|
||||
replace it.
|
||||
- Basic searching.
|
||||
- Elasticsearch integration (to do).
|
||||
|
||||
|
||||
## Install
|
||||
|
||||
1. Install Go.
|
||||
2. Download the binary or do `cd src && go mod tidy && go build`.
|
||||
|
||||
|
||||
|
||||
## Use
|
||||
|
||||
1. Edit `config.yml`. It's well commented.
|
||||
|
@ -40,6 +53,9 @@ I needed to serve a very large dataset full of small files publicly over the int
|
|||
|
||||
By default, it looks for your config in the same directory as the executable: `./config.yml` or `./config.yaml`.
|
||||
|
||||
If you're using initial cache and have tons of files to scan you'll need at least 5GB of RAM and will have to wait 10 or so minutes for it to traverse the directory structure. CrazyFS is heavily threaded so you'll want at least an 8-core machine.
|
||||
If you're using initial cache and have tons of files to scan you'll need at least 5GB of RAM and will have to wait 10 or
|
||||
so minutes for it to traverse the directory structure. CrazyFS is heavily threaded so you'll want at least an 8-core
|
||||
machine.
|
||||
|
||||
The search endpoint searches through the cached files. If they aren't cached, they won't be found. Enable pre-cache at startup to cache everything.
|
||||
The search endpoint searches through the cached files. If they aren't cached, they won't be found. Enable pre-cache at
|
||||
startup to cache everything.
|
||||
|
|
|
@ -0,0 +1,94 @@
|
|||
package CacheItem
|
||||
|
||||
import (
|
||||
"crazyfs/config"
|
||||
"crazyfs/file"
|
||||
"os"
|
||||
"path/filepath"
|
||||
"strings"
|
||||
"time"
|
||||
)
|
||||
|
||||
func NewItem(fullPath string, info os.FileInfo) *Item {
|
||||
if !strings.HasPrefix(fullPath, config.RootDir) {
|
||||
// Sanity check
|
||||
log.Fatalf("NewItem was not passed an absolute path. The path must start with the RootDir: %s", fullPath)
|
||||
}
|
||||
|
||||
if config.CachePrintNew {
|
||||
log.Debugf("CACHE - new: %s", fullPath)
|
||||
}
|
||||
|
||||
pathExists, _ := file.PathExists(fullPath)
|
||||
if !pathExists {
|
||||
if info.Mode()&os.ModeSymlink > 0 {
|
||||
// Ignore symlinks
|
||||
return nil
|
||||
} else {
|
||||
log.Warnf("NewItem - Path does not exist: %s", fullPath)
|
||||
return nil
|
||||
}
|
||||
}
|
||||
|
||||
var mimeType string
|
||||
var ext string
|
||||
var err error
|
||||
if !info.IsDir() {
|
||||
var mimePath string
|
||||
if config.FollowSymlinks && info.Mode()&os.ModeSymlink > 0 {
|
||||
mimePath, _ = filepath.EvalSymlinks(fullPath)
|
||||
} else {
|
||||
mimePath = fullPath
|
||||
}
|
||||
if config.CrawlerParseMIME {
|
||||
_, mimeType, ext, err = file.GetMimeType(mimePath, true, &info)
|
||||
} else {
|
||||
_, mimeType, ext, err = file.GetMimeType(mimePath, false, &info)
|
||||
|
||||
}
|
||||
if os.IsNotExist(err) {
|
||||
log.Warnf("Path does not exist: %s", fullPath)
|
||||
return nil
|
||||
} else if err != nil {
|
||||
log.Warnf("Error detecting MIME type: %v", err)
|
||||
}
|
||||
}
|
||||
|
||||
// Create pointers for mimeType and ext
|
||||
var mimeTypePtr, extPtr *string
|
||||
if mimeType != "" {
|
||||
mimeTypePtr = &mimeType
|
||||
}
|
||||
if ext != "" {
|
||||
extPtr = &ext
|
||||
}
|
||||
|
||||
return &Item{
|
||||
Path: file.StripRootDir(fullPath),
|
||||
Name: info.Name(),
|
||||
Size: info.Size(),
|
||||
Extension: extPtr,
|
||||
Modified: info.ModTime().UTC().Format(time.RFC3339Nano),
|
||||
Mode: uint32(info.Mode().Perm()),
|
||||
IsDir: info.IsDir(),
|
||||
IsSymlink: info.Mode()&os.ModeSymlink != 0,
|
||||
Cached: time.Now().UnixNano() / int64(time.Millisecond), // Set the created time to now in milliseconds
|
||||
Children: make([]string, 0),
|
||||
Type: mimeTypePtr,
|
||||
}
|
||||
}
|
||||
|
||||
type Item struct {
|
||||
Path string `json:"path"`
|
||||
Name string `json:"name"`
|
||||
Size int64 `json:"size"`
|
||||
Extension *string `json:"extension"`
|
||||
Modified string `json:"modified"`
|
||||
Mode uint32 `json:"mode"`
|
||||
IsDir bool `json:"isDir"`
|
||||
IsSymlink bool `json:"isSymlink"`
|
||||
Type *string `json:"type"`
|
||||
Children []string `json:"children"`
|
||||
Content string `json:"content,omitempty"`
|
||||
Cached int64 `json:"cached"`
|
||||
}
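A note on the pointer fields in `Item`: a nil `*string` without `omitempty` is encoded as JSON `null`, so clients always see the `extension` and `type` keys even when the value is unknown. The sketch below demonstrates this with a trimmed-down struct; `listing` and `EncodeListing` are hypothetical names used only for illustration.

```go
package main

import "encoding/json"

// listing is a hypothetical, reduced version of the Item struct above.
type listing struct {
	Name      string  `json:"name"`
	Extension *string `json:"extension"`         // nil pointer is encoded as null
	Content   string  `json:"content,omitempty"` // empty string is omitted entirely
}

// EncodeListing mirrors the "only take the address when non-empty" pattern
// used in NewItem and returns the resulting JSON.
func EncodeListing(name, ext string) (string, error) {
	item := listing{Name: name}
	if ext != "" {
		item.Extension = &ext
	}
	out, err := json.Marshal(item)
	return string(out), err
}
```

`EncodeListing("notes", "")` yields `{"name":"notes","extension":null}`, while `EncodeListing("a.txt", "txt")` yields `{"name":"a.txt","extension":"txt"}`.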
|
|
@ -0,0 +1,12 @@
|
|||
package CacheItem
|
||||
|
||||
import (
|
||||
"crazyfs/logging"
|
||||
"github.com/sirupsen/logrus"
|
||||
)
|
||||
|
||||
var log *logrus.Logger
|
||||
|
||||
func init() {
|
||||
log = logging.GetLogger()
|
||||
}
|
|
@ -0,0 +1,93 @@
|
|||
package ResponseItem
|
||||
|
||||
import (
|
||||
"crazyfs/CacheItem"
|
||||
"crazyfs/cache/DirectoryCrawler"
|
||||
"crazyfs/config"
|
||||
"crazyfs/logging"
|
||||
lru "github.com/hashicorp/golang-lru/v2"
|
||||
"github.com/sirupsen/logrus"
|
||||
"path/filepath"
|
||||
)
|
||||
|
||||
var log *logrus.Logger
|
||||
|
||||
func init() {
|
||||
log = logging.GetLogger()
|
||||
}
|
||||
|
||||
type ResponseItem struct {
|
||||
Path string `json:"path"`
|
||||
Name string `json:"name"`
|
||||
Size int64 `json:"size"`
|
||||
Extension *string `json:"extension"`
|
||||
Modified string `json:"modified"`
|
||||
Mode uint32 `json:"mode"`
|
||||
IsDir bool `json:"isDir"`
|
||||
IsSymlink bool `json:"isSymlink"`
|
||||
Type *string `json:"type"`
|
||||
Children []*CacheItem.Item `json:"children"`
|
||||
Content string `json:"content,omitempty"`
|
||||
Cached int64 `json:"cached"`
|
||||
}
|
||||
|
||||
func NewResponseItem(cacheItem *CacheItem.Item, sharedCache *lru.Cache[string, *CacheItem.Item]) *ResponseItem {
|
||||
item := &ResponseItem{
|
||||
Path: cacheItem.Path,
|
||||
Name: cacheItem.Name,
|
||||
Size: cacheItem.Size,
|
||||
Extension: cacheItem.Extension,
|
||||
Modified: cacheItem.Modified,
|
||||
Mode: cacheItem.Mode,
|
||||
IsDir: cacheItem.IsDir,
|
||||
IsSymlink: cacheItem.IsSymlink,
|
||||
Cached: cacheItem.Cached,
|
||||
Children: make([]*CacheItem.Item, len(cacheItem.Children)),
|
||||
Type: cacheItem.Type,
|
||||
}
|
||||
|
||||
// Grab the children from the cache and add them to this new item
|
||||
if len(cacheItem.Children) > 0 { // avoid a null entry for the children key in the JSON
|
||||
var children []*CacheItem.Item
|
||||
for _, child := range cacheItem.Children {
|
||||
childItem, found := sharedCache.Get(child)
|
||||
|
||||
// Do a quick crawl since the path could have been modified since the last crawl.
|
||||
// This can also be triggered if we encounter a broken symlink. We don't check for broken symlinks when scanning
|
||||
// because that would be an extra os.Lstat() call in processPath().
|
||||
if !found {
|
||||
log.Debugf("CRAWLER - %s not in cache, crawling", child)
|
||||
|
||||
dc := DirectoryCrawler.NewDirectoryCrawler(sharedCache)
|
||||
item, err := dc.CrawlNoRecursion(filepath.Join(config.RootDir, child))
|
||||
|
||||
if err != nil {
|
||||
log.Errorf("NewResponseItem - CrawlNoRecursion - %s", err)
|
||||
continue // skip this child
|
||||
}
|
||||
if item == nil {
|
||||
log.Debugf("NewResponseItem - CrawlNoRecursion - not found %s - likely broken symlink", child)
|
||||
continue
|
||||
}
childItem = item // Use the freshly crawled item; otherwise childItem stays nil and is dereferenced below.
|
||||
}
|
||||
|
||||
copiedChildItem := &CacheItem.Item{
|
||||
Path: childItem.Path,
|
||||
Name: childItem.Name,
|
||||
Size: childItem.Size,
|
||||
Extension: childItem.Extension,
|
||||
Modified: childItem.Modified,
|
||||
Mode: childItem.Mode,
|
||||
IsDir: childItem.IsDir,
|
||||
IsSymlink: childItem.IsSymlink,
|
||||
Cached: childItem.Cached,
|
||||
Children: nil,
|
||||
Type: childItem.Type,
|
||||
}
|
||||
children = append(children, copiedChildItem)
|
||||
}
|
||||
item.Children = children
|
||||
}
|
||||
|
||||
return item
|
||||
}
|
|
@ -1,40 +1,39 @@
|
|||
package api
|
||||
|
||||
import (
|
||||
"crazyfs/CacheItem"
|
||||
"crazyfs/api/helpers"
|
||||
"crazyfs/cache/DirectoryCrawler"
|
||||
"crazyfs/config"
|
||||
"crazyfs/data"
|
||||
"crazyfs/elastic"
|
||||
"encoding/json"
|
||||
lru "github.com/hashicorp/golang-lru/v2"
|
||||
"net/http"
|
||||
)
|
||||
|
||||
func AdminCacheInfo(w http.ResponseWriter, r *http.Request, cfg *config.Config, sharedCache *lru.Cache[string, *data.Item]) {
|
||||
func AdminCacheInfo(w http.ResponseWriter, r *http.Request, cfg *config.Config, sharedCache *lru.Cache[string, *CacheItem.Item]) {
|
||||
auth := r.URL.Query().Get("auth")
|
||||
if auth == "" || auth != cfg.HttpAdminKey {
|
||||
w.Header().Set("Content-Type", "application/json")
|
||||
w.WriteHeader(http.StatusForbidden)
|
||||
json.NewEncoder(w).Encode(map[string]interface{}{
|
||||
"code": 403,
|
||||
"error": "access denied",
|
||||
})
|
||||
helpers.Return403Msg("access denied", w)
|
||||
return
|
||||
}
|
||||
|
||||
cacheLen := sharedCache.Len()
|
||||
keys := r.URL.Query().Get("keys")
|
||||
var cacheKeys []string
|
||||
if keys != "" {
|
||||
cacheKeys = sharedCache.Keys()
|
||||
} else {
|
||||
cacheKeys = []string{}
|
||||
}
|
||||
|
||||
response := map[string]interface{}{
|
||||
"cache_size": cacheLen,
|
||||
"cache_keys": cacheKeys,
|
||||
"cache_max": cfg.CacheSize,
|
||||
"cache_size": cacheLen,
|
||||
"cache_max": cfg.CacheSize,
|
||||
"crawls_running": DirectoryCrawler.GetGlobalActiveCrawls(),
|
||||
"active_workers": DirectoryCrawler.ActiveWorkers,
|
||||
"busy_workers": DirectoryCrawler.ActiveWalks,
|
||||
"new_sync_running": elastic.ElasticRefreshSyncRunning,
|
||||
"refresh_sync_running": elastic.ElasticRefreshSyncRunning,
|
||||
}
|
||||
|
||||
w.Header().Set("Content-Type", "application/json")
|
||||
json.NewEncoder(w).Encode(response)
|
||||
err := json.NewEncoder(w).Encode(response)
|
||||
if err != nil {
|
||||
log.Errorf("AdminCacheInfo - Failed to serialize JSON: %s", err)
|
||||
return
|
||||
}
|
||||
}
|
||||
|
|
|
@ -1,21 +1,17 @@
|
|||
package api
|
||||
|
||||
import (
|
||||
"crazyfs/CacheItem"
|
||||
"crazyfs/api/helpers"
|
||||
"crazyfs/cache"
|
||||
"crazyfs/config"
|
||||
"crazyfs/data"
|
||||
"crazyfs/logging"
|
||||
"crazyfs/file"
|
||||
"encoding/json"
|
||||
lru "github.com/hashicorp/golang-lru/v2"
|
||||
"net/http"
|
||||
"path/filepath"
|
||||
"strings"
|
||||
)
|
||||
|
||||
func AdminReCache(w http.ResponseWriter, r *http.Request, cfg *config.Config, sharedCache *lru.Cache[string, *data.Item]) {
|
||||
log := logging.GetLogger()
|
||||
|
||||
func AdminReCache(w http.ResponseWriter, r *http.Request, cfg *config.Config, sharedCache *lru.Cache[string, *CacheItem.Item]) {
|
||||
if r.Method != http.MethodPost {
|
||||
helpers.Return400Msg("this is a POST endpoint", w)
|
||||
return
|
||||
|
@ -31,33 +27,25 @@ func AdminReCache(w http.ResponseWriter, r *http.Request, cfg *config.Config, sh
|
|||
|
||||
auth := requestBody["auth"]
|
||||
if auth == "" || auth != cfg.HttpAdminKey {
|
||||
w.Header().Set("Content-Type", "application/json")
|
||||
w.WriteHeader(http.StatusForbidden)
|
||||
json.NewEncoder(w).Encode(map[string]interface{}{
|
||||
"code": 403,
|
||||
"error": "access denied",
|
||||
})
|
||||
helpers.Return403Msg("access denied", w)
|
||||
return
|
||||
}
|
||||
|
||||
pathArg := requestBody["path"]
|
||||
|
||||
// Clean the path to prevent directory traversal
|
||||
if strings.Contains(pathArg, "/../") || strings.HasPrefix(pathArg, "../") || strings.HasSuffix(pathArg, "/..") {
|
||||
w.Header().Set("Content-Type", "application/json")
|
||||
w.WriteHeader(http.StatusBadRequest)
|
||||
json.NewEncoder(w).Encode(map[string]interface{}{
|
||||
"code": http.StatusBadRequest,
|
||||
"error": "invalid file path",
|
||||
})
|
||||
fullPath, errJoin := file.SafeJoin(pathArg)
|
||||
traversalAttack, errTraverse := file.DetectTraversal(pathArg)
|
||||
if traversalAttack || errJoin != nil {
|
||||
log.Errorf("AdminRecache - failed to clean path: %s - error: %s - traversal attack detected: %t - traversal attack detection: %s", pathArg, errJoin, traversalAttack, errTraverse)
|
||||
helpers.Return400Msg("invalid file path", w)
|
||||
return
|
||||
}
|
||||
|
||||
fullPath := filepath.Join(cfg.RootDir, filepath.Clean("/"+pathArg))
|
||||
//relPath := cache.StripRootDir(fullPath, cfg.RootDir)
|
||||
|
||||
// Check and re-cache the directory
|
||||
cache.Recache(fullPath, cfg, sharedCache)
|
||||
cache.Recache(fullPath, sharedCache)
|
||||
|
||||
response := map[string]interface{}{
|
||||
"message": "Re-cache triggered for directory: " + fullPath,
|
||||
|
@ -65,5 +53,9 @@ func AdminReCache(w http.ResponseWriter, r *http.Request, cfg *config.Config, sh
|
|||
log.Infof("Admin triggered recache for %s", fullPath)
|
||||
|
||||
w.Header().Set("Content-Type", "application/json")
|
||||
json.NewEncoder(w).Encode(response)
|
||||
err = json.NewEncoder(w).Encode(response)
|
||||
if err != nil {
|
||||
log.Errorf("AdminRecache - Failed to serialize JSON: %s", err)
|
||||
return
|
||||
}
|
||||
}
|
||||
|
|
|
@ -1,57 +1,71 @@
|
|||
package api
|
||||
|
||||
import (
|
||||
"crazyfs/CacheItem"
|
||||
"crazyfs/api/helpers"
|
||||
"crazyfs/cache"
|
||||
"crazyfs/config"
|
||||
"crazyfs/data"
|
||||
"crazyfs/file"
|
||||
"crazyfs/logging"
|
||||
"encoding/json"
|
||||
"fmt"
|
||||
lru "github.com/hashicorp/golang-lru/v2"
|
||||
"net/http"
|
||||
"path/filepath"
|
||||
"strings"
|
||||
)
|
||||
|
||||
func Download(w http.ResponseWriter, r *http.Request, cfg *config.Config, sharedCache *lru.Cache[string, *data.Item]) {
|
||||
if cache.InitialCrawlInProgress && !cfg.HttpAllowDuringInitialCrawl {
|
||||
func Download(w http.ResponseWriter, r *http.Request, cfg *config.Config, sharedCache *lru.Cache[string, *CacheItem.Item]) {
|
||||
if helpers.CheckInitialCrawl() {
|
||||
helpers.HandleRejectDuringInitialCrawl(w)
|
||||
return
|
||||
}
|
||||
|
||||
log := logging.GetLogger()
|
||||
|
||||
queryPath := r.URL.Query().Get("path")
|
||||
if queryPath == "" {
|
||||
pathArg := r.URL.Query().Get("path")
|
||||
if pathArg == "" {
|
||||
helpers.Return400Msg("missing path", w)
|
||||
return
|
||||
}
|
||||
|
||||
paths := strings.Split(queryPath, ",")
|
||||
paths := strings.Split(pathArg, ",")
|
||||
var cleanPaths []string
|
||||
if len(paths) > 1 {
|
||||
for _, path := range paths {
|
||||
cleanPath, errJoin := file.SafeJoin(path)
|
||||
traversalAttack, errTraverse := file.DetectTraversal(path)
|
||||
if traversalAttack || errJoin != nil {
|
||||
log.Errorf("DOWNLOAD - failed to clean path: %s - error: %s - traversal attack detected: %t - traversal attack detection: %s", path, errJoin, traversalAttack, errTraverse)
|
||||
helpers.Return400Msg("invalid file path", w)
|
||||
return
|
||||
}
|
||||
relPath := file.StripRootDir(cleanPath)
|
||||
|
||||
if helpers.CheckPathRestricted(relPath) {
|
||||
helpers.Return403Msg("not allowed to download this path", w)
|
||||
return
|
||||
}
|
||||
|
||||
cleanPaths = append(cleanPaths, cleanPath)
|
||||
}
|
||||
|
||||
// Multiple files, zip them
|
||||
file.ZipHandlerCompressMultiple(paths, w, r, cfg, sharedCache)
|
||||
helpers.ZipHandlerCompressMultiple(cleanPaths, w, r, cfg, sharedCache)
|
||||
return
|
||||
}
|
||||
|
||||
// Single file or directory
|
||||
relPath := cache.StripRootDir(filepath.Join(cfg.RootDir, paths[0]), cfg.RootDir)
|
||||
relPath = strings.TrimSuffix(relPath, "/")
|
||||
fullPath := filepath.Join(cfg.RootDir, relPath)
|
||||
fullPath, errJoin := file.SafeJoin(pathArg)
|
||||
traversalAttack, errTraverse := file.DetectTraversal(pathArg)
|
||||
if traversalAttack || errJoin != nil {
|
||||
log.Errorf("DOWNLOAD - failed to clean path: %s - error: %s - traversal attack detected: %t - traversal attack detection: %s", pathArg, errJoin, traversalAttack, errTraverse)
|
||||
helpers.Return400Msg("invalid file path", w)
|
||||
return
|
||||
}
|
||||
relPath := file.StripRootDir(fullPath)
|
||||
|
||||
// Check if the path is in the restricted download paths
|
||||
for _, restrictedPath := range cfg.RestrictedDownloadPaths {
|
||||
if relPath == restrictedPath {
|
||||
w.Header().Set("Content-Type", "application/json")
|
||||
w.WriteHeader(http.StatusForbidden)
|
||||
json.NewEncoder(w).Encode(map[string]interface{}{
|
||||
"code": http.StatusForbidden,
|
||||
"error": "not allowed to download this path",
|
||||
})
|
||||
return
|
||||
}
|
||||
if helpers.CheckPathRestricted(relPath) {
|
||||
helpers.Return403Msg("not allowed to download this path", w)
|
||||
return
|
||||
}
|
||||
|
||||
// Try to get the data from the cache
|
||||
|
@ -76,28 +90,25 @@ func Download(w http.ResponseWriter, r *http.Request, cfg *config.Config, shared
|
|||
var mimeType string
|
||||
var err error
|
||||
if item.Type == nil {
|
||||
fileExists, mimeType, _, err = cache.GetFileMime(fullPath, true)
|
||||
fileExists, mimeType, _, err = file.GetMimeType(fullPath, true, nil)
|
||||
if !fileExists {
|
||||
helpers.Return400Msg("file not found", w)
return
|
||||
}
|
||||
if err != nil {
|
||||
log.Warnf("Error detecting MIME type: %v", err)
|
||||
w.Header().Set("Content-Type", "application/json")
|
||||
w.WriteHeader(http.StatusInternalServerError)
|
||||
json.NewEncoder(w).Encode(map[string]interface{}{
|
||||
"code": 500,
|
||||
"error": "internal server error",
|
||||
})
|
||||
helpers.Return500Msg(w)
|
||||
return
|
||||
}
|
||||
// GetFileMime() returns an empty string if it was a directory
|
||||
// GetMimeType() returns an empty string if it was a directory
|
||||
if mimeType != "" {
|
||||
// Update the item's MIME in the sharedCache
|
||||
// Update the CacheItem's MIME in the sharedCache
|
||||
item.Type = &mimeType
|
||||
sharedCache.Add(relPath, item)
|
||||
}
|
||||
}
|
||||
|
||||
// https://stackoverflow.com/a/57994289
|
||||
|
||||
// Only files can have inline disposition, zip archives cannot
|
||||
contentDownload := r.URL.Query().Get("download")
|
||||
var disposition string
|
||||
|
@ -113,6 +124,6 @@ func Download(w http.ResponseWriter, r *http.Request, cfg *config.Config, shared
|
|||
} else {
|
||||
// Stream archive of the directory here
|
||||
w.Header().Set("Content-Disposition", fmt.Sprintf(`attachment; filename="%s.zip"`, item.Name))
|
||||
file.ZipHandlerCompress(fullPath, w, r)
|
||||
helpers.ZipHandlerCompress(fullPath, w, r)
|
||||
}
|
||||
}
|
|
@ -1,9 +1,10 @@
|
|||
package api
|
||||
|
||||
import (
|
||||
"crazyfs/CacheItem"
|
||||
"crazyfs/cache"
|
||||
"crazyfs/cache/DirectoryCrawler"
|
||||
"crazyfs/config"
|
||||
"crazyfs/data"
|
||||
"encoding/json"
|
||||
lru "github.com/hashicorp/golang-lru/v2"
|
||||
"net/http"
|
||||
|
@ -11,14 +12,18 @@ import (
|
|||
|
||||
// TODO: show the time the initial crawl started
|
||||
|
||||
func HealthCheck(w http.ResponseWriter, r *http.Request, cfg *config.Config, sharedCache *lru.Cache[string, *data.Item]) {
|
||||
func HealthCheck(w http.ResponseWriter, r *http.Request, cfg *config.Config, sharedCache *lru.Cache[string, *CacheItem.Item]) {
|
||||
//log := logging.GetLogger()
|
||||
|
||||
response := map[string]interface{}{}
|
||||
|
||||
//response["scan_running"] = cache.GetRunningScans() > 0
|
||||
response["scan_running"] = DirectoryCrawler.GetGlobalActiveCrawls() > 0
|
||||
response["initial_scan_running"] = cache.InitialCrawlInProgress
|
||||
|
||||
w.Header().Set("Content-Type", "application/json")
|
||||
json.NewEncoder(w).Encode(response)
|
||||
err := json.NewEncoder(w).Encode(response)
|
||||
if err != nil {
|
||||
log.Errorf("HEALTH - Failed to serialize JSON: %s", err)
|
||||
return
|
||||
}
|
||||
}
|
||||
|
|
|
@ -0,0 +1,185 @@
|
|||
package api
|
||||
|
||||
import (
|
||||
"crazyfs/CacheItem"
|
||||
"crazyfs/ResponseItem"
|
||||
"crazyfs/api/helpers"
|
||||
"crazyfs/config"
|
||||
"crazyfs/file"
|
||||
"encoding/json"
|
||||
lru "github.com/hashicorp/golang-lru/v2"
|
||||
"net/http"
|
||||
"strconv"
|
||||
)
|
||||
|
||||
func ListDir(w http.ResponseWriter, r *http.Request, cfg *config.Config, sharedCache *lru.Cache[string, *CacheItem.Item]) {
|
||||
if helpers.CheckInitialCrawl() {
|
||||
helpers.HandleRejectDuringInitialCrawl(w)
|
||||
return
|
||||
}
|
||||
|
||||
pathArg := r.URL.Query().Get("path")
|
||||
if pathArg == "" {
|
||||
helpers.Return400Msg("path parameter is required", w)
|
||||
return
|
||||
}
|
||||
|
||||
var err error
|
||||
|
||||
sortArg := r.URL.Query().Get("sort")
|
||||
var folderSorting string
|
||||
if sortArg == "default" || sortArg == "" {
|
||||
folderSorting = "default"
|
||||
} else if sortArg == "folders" {
|
||||
folderSorting = "folders"
|
||||
} else {
|
||||
helpers.Return400Msg("sort arg must be 'default' (to not do any sorting) or 'folders' (to sort the folders to the front of the list)", w)
|
||||
return
|
||||
}
|
||||
|
||||
fullPath, errJoin := file.SafeJoin(pathArg)
|
||||
traversalAttack, errTraverse := file.DetectTraversal(pathArg)
|
||||
if traversalAttack || errJoin != nil {
|
||||
log.Errorf("LIST - failed to clean path: %s - error: %s - traversal attack detected: %t - traversal attack detection: %s", pathArg, errJoin, traversalAttack, errTraverse)
|
||||
helpers.Return400Msg("invalid file path", w)
|
||||
return
|
||||
}
|
||||
|
||||
relPath := file.StripRootDir(fullPath)
|
||||
// Try to get the data from the cache
|
||||
cacheItem, found := sharedCache.Get(relPath)
|
||||
if !found {
|
||||
cacheItem = helpers.HandleFileNotFound(relPath, fullPath, sharedCache, cfg, w)
|
||||
}
|
||||
if cacheItem == nil {
|
||||
return // The errors have already been handled in handleFileNotFound() so we're good to just exit
|
||||
}
|
||||
|
||||
// Create a copy of the cached Item so we don't modify the Item in the cache
|
||||
item := ResponseItem.NewResponseItem(cacheItem, sharedCache)
|
||||
|
||||
// Get the MIME type of the file if the 'mime' argument is present
|
||||
mime := r.URL.Query().Get("mime")
|
||||
if mime != "" {
|
||||
if item.IsDir && !cfg.HttpAllowDirMimeParse {
|
||||
helpers.Return403Msg("not allowed to analyze the mime of directories", w)
|
||||
return
|
||||
} else {
|
||||
// Only update the mime in the cache if it hasn't been set already.
|
||||
// TODO: need to make sure that when a re-crawl is triggered, the Type is set back to nil
|
||||
if item.Type == nil {
|
||||
fileExists, mimeType, ext, err := file.GetMimeType(fullPath, true, nil)
|
||||
if !fileExists {
|
||||
helpers.ReturnFake404Msg("file not found", w)
return
|
||||
}
|
||||
if err != nil {
|
||||
log.Warnf("Error detecting MIME type: %v", err)
|
||||
helpers.Return500Msg(w)
|
||||
return
|
||||
}
|
||||
// Update the original cached CacheItem's MIME in the sharedCache
|
||||
cacheItem.Type = &mimeType
|
||||
cacheItem.Extension = &ext
|
||||
sharedCache.Add(relPath, cacheItem) // take the address of CacheItem
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
response := map[string]interface{}{}
|
||||
|
||||
// Pagination
|
||||
var paginationLimit int
|
||||
if r.URL.Query().Get("limit") != "" {
|
||||
if !helpers.IsNonNegativeInt(r.URL.Query().Get("limit")) {
|
||||
helpers.Return400Msg("limit must be a positive number", w)
|
||||
return
|
||||
}
|
||||
paginationLimit, err = strconv.Atoi(r.URL.Query().Get("limit"))
|
||||
if err != nil {
|
||||
log.Errorf("Error parsing limit: %v", err)
|
||||
helpers.Return400Msg("limit must be a valid integer", w)
|
||||
return
|
||||
}
if paginationLimit < 1 {
// A limit of 0 would cause a division by zero when computing totalPages below.
helpers.Return400Msg("limit must be a positive number", w)
return
}
|
||||
} else {
|
||||
paginationLimit = 100
|
||||
}
|
||||
|
||||
totalItems := len(item.Children)
|
||||
totalPages := totalItems / paginationLimit
|
||||
if totalItems%paginationLimit != 0 {
|
||||
totalPages++
|
||||
}
|
||||
|
||||
if r.URL.Query().Get("page") != "" {
|
||||
response["total_pages"] = totalPages
|
||||
}
|
||||
|
||||
if folderSorting == "folders" {
|
||||
var dirs, files []*CacheItem.Item
|
||||
for _, child := range item.Children {
|
||||
if child.IsDir {
|
||||
dirs = append(dirs, child)
|
||||
} else {
|
||||
files = append(files, child)
|
||||
}
|
||||
}
|
||||
item.Children = append(dirs, files...)
|
||||
}
|
||||
|
||||
//Set the children to an empty array so that the JSON encoder doesn't return it as nil
|
||||
var paginatedChildren []*CacheItem.Item // this var is either the full CacheItem list or a paginated list depending on the query args
|
||||
if item.Children != nil {
|
||||
paginatedChildren = item.Children
|
||||
} else {
|
||||
paginatedChildren = make([]*CacheItem.Item, 0)
|
||||
}
|
||||
|
||||
pageParam := r.URL.Query().Get("page")
|
||||
if pageParam != "" {
|
||||
page, err := strconv.Atoi(pageParam)
|
||||
if err != nil || page < 1 || page > totalPages {
|
||||
// Don't return an error, just truncate things
|
||||
page = totalPages
|
||||
}
|
||||
|
||||
start := (page - 1) * paginationLimit
|
||||
end := start + paginationLimit
|
||||
|
||||
if start >= 0 { // avoid segfaults
|
||||
if start > len(item.Children) {
|
||||
start = len(item.Children)
|
||||
}
|
||||
if end > len(item.Children) {
|
||||
end = len(item.Children)
|
||||
}
|
||||
paginatedChildren = paginatedChildren[start:end]
|
||||
}
|
||||
}
|
||||
|
||||
// Erase the children of the children so we aren't displaying things recursively
|
||||
for i := range paginatedChildren {
|
||||
paginatedChildren[i].Children = nil
|
||||
}
|
||||
|
||||
response["item"] = map[string]interface{}{
|
||||
"path": item.Path,
|
||||
"name": item.Name,
|
||||
"size": item.Size,
|
||||
"extension": item.Extension,
|
||||
"modified": item.Modified,
|
||||
"mode": item.Mode,
|
||||
"isDir": item.IsDir,
|
||||
"isSymlink": item.IsSymlink,
|
||||
"cached": item.Cached,
|
||||
"children": paginatedChildren,
|
||||
"type": item.Type,
|
||||
}
|
||||
|
||||
w.Header().Set("Cache-Control", "no-store")
|
||||
w.Header().Set("Content-Type", "application/json")
|
||||
err = json.NewEncoder(w).Encode(response)
|
||||
if err != nil {
|
||||
log.Errorf("LIST - Failed to serialize JSON: %s", err)
|
||||
return
|
||||
}
|
||||
}
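The pagination arithmetic in `ListDir` can be isolated into a small pure function, which makes the edge cases (empty listings, out-of-range pages, a zero limit) easier to reason about. This is a sketch under stated assumptions, not a drop-in replacement: `PageBounds` is a hypothetical name, and clamping a non-positive limit to 1 is an illustrative choice that differs from rejecting the request.

```go
package main

// PageBounds returns the slice bounds [start:end) for a 1-indexed page,
// plus the total number of pages. Out-of-range pages fall back to the last
// page, mirroring the handler's "truncate instead of error" behaviour.
func PageBounds(total, limit, page int) (start, end, totalPages int) {
	if limit < 1 {
		limit = 1 // guard against division by zero (illustrative choice)
	}
	totalPages = (total + limit - 1) / limit // ceiling division
	if page < 1 || page > totalPages {
		page = totalPages
	}
	if page < 1 {
		return 0, 0, totalPages // nothing to show for an empty listing
	}
	start = (page - 1) * limit
	end = start + limit
	if end > total {
		end = total
	}
	return start, end, totalPages
}
```

For 250 children and a limit of 100, page 3 maps to the bounds 200 and 250, and any page beyond 3 is treated as page 3.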
|
|
@ -1,23 +1,22 @@
|
|||
package api
|
||||
|
||||
import (
|
||||
"bytes"
|
||||
"crazyfs/CacheItem"
|
||||
"crazyfs/api/helpers"
|
||||
"crazyfs/cache"
|
||||
"crazyfs/config"
|
||||
"crazyfs/data"
|
||||
"encoding/gob"
|
||||
"crazyfs/elastic"
|
||||
"encoding/json"
|
||||
lru "github.com/hashicorp/golang-lru/v2"
|
||||
"log"
|
||||
"net/http"
|
||||
"sort"
|
||||
"strconv"
|
||||
"strings"
|
||||
"time"
|
||||
)
|
||||
|
||||
func SearchFile(w http.ResponseWriter, r *http.Request, cfg *config.Config, sharedCache *lru.Cache[string, *data.Item]) {
|
||||
if cache.InitialCrawlInProgress && !cfg.HttpAllowDuringInitialCrawl {
|
||||
func SearchFile(w http.ResponseWriter, r *http.Request, cfg *config.Config, sharedCache *lru.Cache[string, *CacheItem.Item]) {
|
||||
if helpers.CheckInitialCrawl() {
|
||||
helpers.HandleRejectDuringInitialCrawl(w)
|
||||
return
|
||||
}
|
||||
|
@ -28,8 +27,10 @@ func SearchFile(w http.ResponseWriter, r *http.Request, cfg *config.Config, shar
|
|||
return
|
||||
}
|
||||
|
||||
queryString = strings.ToLower(queryString) // convert to lowercase
|
||||
//queryElements := strings.Split(queryString, " ") // split by spaces
|
||||
if !cfg.ElasticsearchEnable {
|
||||
// If we aren't using Elastic, convert the query to lowercase to keep matching simple.
|
||||
queryString = strings.ToLower(queryString)
|
||||
}
|
||||
|
||||
excludeString := r.URL.Query().Get("exclude") // get exclude parameter
|
||||
var excludeElements []string
|
||||
|
@ -40,7 +41,7 @@ func SearchFile(w http.ResponseWriter, r *http.Request, cfg *config.Config, shar
|
|||
limitResultsStr := r.URL.Query().Get("limit")
|
||||
var limitResults int
|
||||
if limitResultsStr != "" {
|
||||
if !helpers.IsPositiveInt(limitResultsStr) {
|
||||
if !helpers.IsNonNegativeInt(limitResultsStr) {
|
||||
helpers.Return400Msg("limit must be a non-negative number", w)
|
||||
return
|
||||
}
|
||||
|
@ -51,60 +52,97 @@ func SearchFile(w http.ResponseWriter, r *http.Request, cfg *config.Config, shar
|
|||
|
||||
sortArg := r.URL.Query().Get("sort")
|
||||
var folderSorting string
|
||||
if sortArg == "default" || sortArg == "" {
|
||||
|
||||
switch sortArg {
|
||||
case "default", "":
|
||||
folderSorting = "default"
|
||||
} else if sortArg == "folders" {
|
||||
case "folders":
|
||||
folderSorting = "folders"
|
||||
} else {
|
||||
default:
|
||||
helpers.Return400Msg("sort arg must be 'default' (to not do any sorting) or 'folders' (to sort the folders to the front of the list)", w)
|
||||
return
|
||||
}
|
||||
|
||||
results := make([]*data.Item, 0)
|
||||
outer:
|
||||
for _, key := range sharedCache.Keys() {
|
||||
cacheItem, found := sharedCache.Get(key)
|
||||
if found {
|
||||
//for _, query := range queryElements {
|
||||
if strings.Contains(strings.ToLower(key), queryString) { // query) { // convert key to lowercase
|
||||
// check if key contains any of the exclude elements
|
||||
shouldExclude := false
|
||||
for _, exclude := range excludeElements {
|
||||
if strings.Contains(strings.ToLower(key), exclude) {
|
||||
shouldExclude = true
|
||||
break
|
||||
}
|
||||
}
|
||||
if shouldExclude {
|
||||
continue
|
||||
}
|
||||
searchStart := time.Now()
|
||||
|
||||
// Create a deep copy of the item
|
||||
var buf bytes.Buffer
|
||||
enc := gob.NewEncoder(&buf)
|
||||
dec := gob.NewDecoder(&buf)
|
||||
err := enc.Encode(cacheItem)
|
||||
if err != nil {
|
||||
log.Printf("Error encoding item: %v", err)
|
||||
return
|
||||
}
|
||||
var item data.Item
|
||||
err = dec.Decode(&item)
|
||||
if err != nil {
|
||||
log.Printf("Error decoding item: %v", err)
|
||||
return
|
||||
}
|
||||
if !cfg.ApiSearchShowChildren {
|
||||
item.Children = make([]*data.Item, 0) // erase the children dict
|
||||
}
|
||||
results = append(results, &item)
|
||||
if (limitResults > 0 && len(results) == limitResults) || len(results) >= cfg.ApiSearchMaxResults {
|
||||
break outer
|
||||
}
|
||||
}
|
||||
//}
|
||||
var results []*CacheItem.Item
|
||||
results = make([]*CacheItem.Item, 0)
|
||||
|
||||
if cfg.ElasticsearchEnable {
|
||||
// Perform the Elasticsearch query
|
||||
resp, err := elastic.Search(queryString, excludeElements, cfg)
|
||||
if err != nil {
|
||||
log.Errorf("SEARCH - Failed to perform Elasticsearch query: %s", err)
|
||||
helpers.Return500Msg(w)
|
||||
return
|
||||
}
|
||||
|
||||
// Parse the Elasticsearch response
|
||||
var respData map[string]interface{}
|
||||
err = json.NewDecoder(resp.Body).Decode(&respData)
|
||||
if err != nil {
|
||||
log.Errorf("SEARCH - Failed to parse Elasticsearch response: %s", err)
|
||||
helpers.Return500Msg(w)
|
||||
return
|
||||
}
|
||||
|
||||
if resp.IsError() || resp.StatusCode != 200 {
|
||||
// Elastic reported an error with the query.
|
||||
var errorMsg, clientResp string
|
||||
errorMsg, err = elastic.GetSearchFailureReason(respData)
|
||||
if err == nil {
|
||||
clientResp = errorMsg
|
||||
} else {
|
||||
clientResp = "Query failed"
|
||||
}
|
||||
helpers.Return400Msg(clientResp, w)
|
||||
return
|
||||
}
|
||||
|
||||
if respData["hits"] != nil {
|
||||
// Extract the results from the Elasticsearch response
|
||||
hits := respData["hits"].(map[string]interface{})["hits"].([]interface{})
|
||||
items := make([]*CacheItem.Item, len(hits))
|
||||
for i, hit := range hits {
|
||||
itemSource := hit.(map[string]interface{})["_source"].(map[string]interface{})
|
||||
|
||||
// Elastic does some things differently than us.
|
||||
var itemExtension *string
|
||||
if itemSource["extension"] != nil {
|
||||
extensionStr := itemSource["extension"].(string)
|
||||
itemExtension = &extensionStr
|
||||
}
|
||||
var itemType *string
|
||||
if itemSource["type"] != nil {
|
||||
typeStr := itemSource["type"].(string)
|
||||
itemType = &typeStr
|
||||
}
|
||||
|
||||
//score := hit.(map[string]interface{})["_score"].(float64)
|
||||
item := &CacheItem.Item{
|
||||
Path: itemSource["path"].(string),
|
||||
Name: itemSource["name"].(string),
|
||||
Size: int64(itemSource["size"].(float64)),
|
||||
Extension: itemExtension,
|
||||
Modified: itemSource["modified"].(string),
|
||||
Mode: uint32(itemSource["mode"].(float64)),
|
||||
IsDir: itemSource["isDir"].(bool),
|
||||
IsSymlink: itemSource["isSymlink"].(bool),
|
||||
Type: itemType,
|
||||
Cached: int64(itemSource["cached"].(float64)),
|
||||
}
|
||||
items[i] = item
|
||||
}
|
||||
|
||||
// Sort the items by their Elasticsearch _score
|
||||
sort.Slice(items, func(i, j int) bool {
|
||||
return hits[i].(map[string]interface{})["_score"].(float64) > hits[j].(map[string]interface{})["_score"].(float64)
|
||||
})
|
||||
|
||||
results = append(results, items...)
|
||||
}
|
||||
} else {
|
||||
results = cache.SearchLRU(queryString, excludeElements, limitResults, sharedCache, cfg)
|
||||
}
|
||||
|
||||
if folderSorting == "folders" {
|
||||
|
@ -113,9 +151,17 @@ outer:
|
|||
})
|
||||
}
|
||||
|
||||
searchDuration := time.Since(searchStart).Round(time.Second)
|
||||
log.Infof("SEARCH - completed in %s and returned %d items", searchDuration, len(results))
|
||||
|
||||
w.Header().Set("Cache-Control", "no-store")
|
||||
w.Header().Set("Content-Type", "application/json")
|
||||
json.NewEncoder(w).Encode(map[string]interface{}{
|
||||
err := json.NewEncoder(w).Encode(map[string]interface{}{
|
||||
"results": results,
|
||||
})
|
||||
if err != nil {
|
||||
log.Errorf("SEARCH - Failed to serialize JSON: %s", err)
|
||||
helpers.Return500Msg(w)
|
||||
return
|
||||
}
|
||||
}
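One thing to watch in the Elasticsearch branch above: `sort.Slice(items, ...)` swaps elements of `items` while the comparison reads scores from `hits`, so the two slices drift out of step as soon as the first swap happens. A minimal sketch of one way around this, assuming the response shape used above (`hits.hits[]._score` and `_source`), is to decode into typed structs and sort the hits themselves. The type and function names here (`searchHit`, `searchResponse`, `parseHitsByScore`) are hypothetical and not part of the codebase.

```go
package main

import (
	"encoding/json"
	"sort"
)

// searchHit is a hypothetical typed view of one Elasticsearch hit.
// Only a few of the _source fields used in the diff are included.
type searchHit struct {
	Score  float64 `json:"_score"`
	Source struct {
		Path string `json:"path"`
		Name string `json:"name"`
		Size int64  `json:"size"`
	} `json:"_source"`
}

// searchResponse is a hypothetical typed view of the response body.
type searchResponse struct {
	Hits struct {
		Hits []searchHit `json:"hits"`
	} `json:"hits"`
}

// parseHitsByScore decodes the raw response and returns the hits ordered
// by descending score. The score travels with each hit, so the sort
// cannot desynchronize the way two parallel slices can.
func parseHitsByScore(raw []byte) ([]searchHit, error) {
	var resp searchResponse
	if err := json.Unmarshal(raw, &resp); err != nil {
		return nil, err
	}
	hits := resp.Hits.Hits
	sort.SliceStable(hits, func(i, j int) bool {
		return hits[i].Score > hits[j].Score
	})
	return hits, nil
}
```

Decoding into structs also replaces the chain of unchecked type assertions, which panic if a field is missing or has an unexpected type. Elasticsearch normally returns hits ordered by score already, so the explicit sort may be unnecessary in practice.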
|
||||
|
|
|
@ -0,0 +1,229 @@
|
|||
package api
|
||||
|
||||
import (
|
||||
"bytes"
|
||||
"crazyfs/CacheItem"
|
||||
"crazyfs/api/helpers"
|
||||
"crazyfs/cache"
|
||||
"crazyfs/config"
|
||||
"crazyfs/file"
|
||||
"crazyfs/logging"
|
||||
"fmt"
|
||||
"github.com/disintegration/imaging"
|
||||
lru "github.com/hashicorp/golang-lru/v2"
|
||||
"github.com/nfnt/resize"
|
||||
"strconv"
|
||||
|
||||
"image"
|
||||
"image/color"
|
||||
"image/png"
|
||||
"net/http"
|
||||
"path/filepath"
|
||||
"strings"
|
||||
)
|
||||
|
||||
func Thumbnail(w http.ResponseWriter, r *http.Request, cfg *config.Config, sharedCache *lru.Cache[string, *CacheItem.Item]) {
|
||||
if cache.InitialCrawlInProgress && !cfg.HttpAllowDuringInitialCrawl {
|
||||
helpers.HandleRejectDuringInitialCrawl(w)
|
||||
returnDummyPNG(w)
|
||||
return
|
||||
}
|
||||
|
||||
log := logging.GetLogger()
|
||||
relPath := file.StripRootDir(filepath.Join(cfg.RootDir, r.URL.Query().Get("path")))
|
||||
relPath = strings.TrimSuffix(relPath, "/")
|
||||
fullPath := filepath.Join(cfg.RootDir, relPath)
|
||||
|
||||
// Validate args before doing any operations
|
||||
width, err := getPositiveIntFromQuery(r, "width")
|
||||
if err != nil {
|
||||
helpers.Return400Msg("height and width must both be positive numbers", w)
|
||||
return
|
||||
}
|
||||
height, err := getPositiveIntFromQuery(r, "height")
|
||||
if err != nil {
|
||||
helpers.Return400Msg("height and width must both be positive numbers", w)
|
||||
return
|
||||
}
|
||||
|
||||
pngQuality, err := getPositiveIntFromQuery(r, "quality")
|
||||
if err != nil {
|
||||
helpers.Return400Msg("quality must be a positive number", w)
|
||||
return
|
||||
}
|
||||
if pngQuality == 0 {
|
||||
pngQuality = 50
|
||||
}
|
||||
|
||||
autoScale := r.URL.Query().Get("auto") != ""
|
||||
square := r.URL.Query().Get("square") != ""
|
||||
if square && (width != 0 && height != 0) && (width != height) {
|
||||
helpers.Return400Msg("width and height must be equal in square mode, or only one provided", w)
|
||||
return
|
||||
}
|
||||
|
||||
// Try to get the data from the cache
|
||||
item, found := sharedCache.Get(relPath)
|
||||
if !found {
|
||||
item = helpers.HandleFileNotFound(relPath, fullPath, sharedCache, cfg, w)
|
||||
}
|
||||
if item == nil {
|
||||
returnDummyPNG(w)
|
||||
return
|
||||
}
|
||||
|
||||
if item.IsDir {
|
||||
helpers.Return400Msg("that's a directory", w)
|
||||
return
|
||||
}
|
||||
|
||||
// Get the MIME type of the file
|
||||
fileExists, mimeType, ext, err := file.GetMimeType(fullPath, true, nil)
|
||||
if !fileExists {
|
||||
helpers.Return400Msg("file not found", w)
|
||||
return
|
||||
}
|
||||
if err != nil {
|
||||
log.Errorf("THUMB - error detecting MIME type: %v", err)
|
||||
returnDummyPNG(w)
|
||||
return
|
||||
}
|
||||
// Update the CacheItem's MIME in the sharedCache
|
||||
item.Type = &mimeType
|
||||
item.Extension = &ext
|
||||
sharedCache.Add(relPath, item)
|
||||
|
||||
// Check if the file is an image
|
||||
if !strings.HasPrefix(mimeType, "image/") {
|
||||
helpers.Return400Msg("file is not an image", w)
|
||||
return
|
||||
}
|
||||
|
||||
// Convert the image to a PNG
|
||||
imageBytes, err := file.ConvertToPNG(fullPath, mimeType)
|
||||
if err != nil {
|
||||
log.Warnf("Error converting %s to PNG: %v", fullPath, err)
|
||||
returnDummyPNG(w)
|
||||
return
|
||||
}
|
||||
|
||||
// Decode the image
|
||||
var img image.Image
|
||||
img, err = png.Decode(bytes.NewReader(imageBytes))
|
||||
if err != nil {
|
||||
log.Warnf("Error decoding %s image data: %v", fullPath, err)
|
||||
returnDummyPNG(w)
|
||||
return
|
||||
}
|
||||
|
||||
// Resize the image
|
||||
img, err = resizeImage(img, width, height, square, autoScale)
|
||||
if err != nil {
|
||||
helpers.Return400Msg(err.Error(), w)
|
||||
return
|
||||
}
|
||||
|
||||
buf, err := file.CompressPNGFile(img, pngQuality)
|
||||
if err != nil {
|
||||
log.Warnf("Error compressing %s to PNG: %v", fullPath, err)
|
||||
returnDummyPNG(w)
|
||||
return
|
||||
}
|
||||
|
||||
// Return the image
|
||||
w.Header().Set("Content-Type", "image/png")
|
||||
w.Write(buf.Bytes())
|
||||
}
|
||||
|
||||
func getPositiveIntFromQuery(r *http.Request, key string) (int, error) {
|
||||
str := r.URL.Query().Get(key)
|
||||
if str == "" {
|
||||
return 0, nil
|
||||
}
|
||||
if !helpers.IsNonNegativeInt(str) {
|
||||
return 0, fmt.Errorf("invalid value for %s", key)
|
||||
}
|
||||
value, _ := strconv.ParseInt(str, 10, 32)
|
||||
return int(value), nil
|
||||
}
|
||||
|
||||
func returnDummyPNG(w http.ResponseWriter) {
|
||||
img := image.NewRGBA(image.Rect(0, 0, 300, 300))
|
||||
blue := color.RGBA{R: 255, G: 255, B: 255, A: 255} // NOTE: despite the name, this is opaque white
|
||||
|
||||
for y := 0; y < img.Bounds().Dy(); y++ {
|
||||
for x := 0; x < img.Bounds().Dx(); x++ {
|
||||
img.Set(x, y, blue)
|
||||
}
|
||||
}
|
||||
|
||||
buffer := new(bytes.Buffer)
|
||||
if err := png.Encode(buffer, img); err != nil {
|
||||
http.Error(w, "encode failed", http.StatusInternalServerError)
|
||||
return
|
||||
}
|
||||
|
||||
// TODO: set cache-control based on config?
|
||||
|
||||
w.Header().Set("Content-Type", "image/png")
|
||||
_, err := w.Write(buffer.Bytes())
|
||||
if err != nil {
|
||||
log.Errorf("THUMBNAIL - Failed to write buffer: %s", err)
|
||||
return
|
||||
}
|
||||
}
|
||||
|
||||
func resizeImage(img image.Image, width, height int, square, autoScale bool) (image.Image, error) {
|
||||
if square {
|
||||
var size int
|
||||
if width == 0 && height == 0 {
|
||||
size = 300
|
||||
} else if width != 0 {
|
||||
size = width
|
||||
} else {
|
||||
size = height
|
||||
}
|
||||
if size > img.Bounds().Dx() || size > img.Bounds().Dy() {
|
||||
size = helpers.Max(img.Bounds().Dx(), img.Bounds().Dy())
|
||||
}
|
||||
|
||||
// First, scale the image so that its smaller dimension matches the target size
|
||||
if img.Bounds().Dx() > img.Bounds().Dy() {
|
||||
width = 0
|
||||
height = size
|
||||
} else {
|
||||
width = size
|
||||
height = 0
|
||||
}
|
||||
resized := resize.Resize(uint(width), uint(height), img, resize.Lanczos3)
|
||||
|
||||
// Then crop the image to the target size
|
||||
img = imaging.CropCenter(resized, size, size)
|
||||
} else {
|
||||
if width == 0 && height == 0 {
|
||||
if autoScale {
|
||||
// If both width and height parameters are not provided, set
|
||||
// the largest dimension to 300 and scale the other.
|
||||
if img.Bounds().Dx() > img.Bounds().Dy() {
|
||||
width = 300
|
||||
height = 0
|
||||
} else {
|
||||
width = 0
|
||||
height = 300
|
||||
}
|
||||
} else {
|
||||
// Don't auto-resize because this endpoint can also be used for simply reducing the quality of an image
|
||||
width = img.Bounds().Dx()
|
||||
height = img.Bounds().Dy()
|
||||
}
|
||||
} else if width == 0 {
|
||||
// If only height is provided, calculate the width based on the image's aspect ratio
|
||||
width = img.Bounds().Dx() * height / img.Bounds().Dy()
|
||||
} else if height == 0 {
|
||||
height = img.Bounds().Dy() * width / img.Bounds().Dx()
|
||||
}
|
||||
// Scale the image. If the image is smaller than the provided height or width, it won't be resized.
|
||||
img = resize.Resize(uint(width), uint(height), img, resize.Lanczos3)
|
||||
}
|
||||
return img, nil
|
||||
}
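The non-square branch of `resizeImage` fills in whichever dimension was left at zero from the source image's aspect ratio. A minimal sketch of that arithmetic in isolation is below. `scaledDimensions` is a hypothetical helper, not a function in the codebase, and it assumes the source dimensions are non-zero.

```go
package main

// scaledDimensions returns the output size for a resize request.
// A zero width or height means "derive it from the aspect ratio";
// if both are zero the source size is returned unchanged.
// srcW and srcH are assumed to be non-zero.
func scaledDimensions(srcW, srcH, width, height int) (int, int) {
	switch {
	case width == 0 && height == 0:
		return srcW, srcH
	case width == 0:
		// Only height was given: derive width.
		return srcW * height / srcH, height
	case height == 0:
		// Only width was given: derive height.
		return width, srcH * width / srcW
	default:
		return width, height
	}
}
```

The arithmetic uses integer division, so results are truncated rather than rounded; for very small targets the derived dimension can reach zero.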
|
|
@ -0,0 +1,27 @@
|
|||
package client
|
||||
|
||||
import (
|
||||
"crazyfs/CacheItem"
|
||||
"crazyfs/cache"
|
||||
"crazyfs/cache/DirectoryCrawler"
|
||||
"crazyfs/config"
|
||||
"encoding/json"
|
||||
lru "github.com/hashicorp/golang-lru/v2"
|
||||
"net/http"
|
||||
)
|
||||
|
||||
// TODO: show the time the initial crawl started
|
||||
|
||||
func ClientHealthCheck(w http.ResponseWriter, r *http.Request, cfg *config.Config, sharedCache *lru.Cache[string, *CacheItem.Item]) {
|
||||
response := map[string]interface{}{}
|
||||
|
||||
response["scan_running"] = DirectoryCrawler.GetGlobalActiveCrawls() > 0
|
||||
response["initial_scan_running"] = cache.InitialCrawlInProgress
|
||||
|
||||
w.Header().Set("Content-Type", "application/json")
|
||||
err := json.NewEncoder(w).Encode(response)
|
||||
if err != nil {
|
||||
log.Errorf("HEALTH - Failed to serialize JSON: %s", err)
|
||||
return
|
||||
}
|
||||
}
|
|
@ -0,0 +1,22 @@
|
|||
package client
|
||||
|
||||
import (
|
||||
"crazyfs/CacheItem"
|
||||
"crazyfs/config"
|
||||
"encoding/json"
|
||||
lru "github.com/hashicorp/golang-lru/v2"
|
||||
"net/http"
|
||||
)
|
||||
|
||||
func RestrictedDownloadDirectories(w http.ResponseWriter, r *http.Request, cfg *config.Config, sharedCache *lru.Cache[string, *CacheItem.Item]) {
|
||||
response := map[string]interface{}{
|
||||
"restricted_download_directories": config.RestrictedDownloadPaths,
|
||||
}
|
||||
|
||||
w.Header().Set("Content-Type", "application/json")
|
||||
err := json.NewEncoder(w).Encode(response)
|
||||
if err != nil {
|
||||
log.Errorf("RestrictedDownloadDirectories - Failed to serialize JSON: %s", err)
|
||||
return
|
||||
}
|
||||
}
|
|
@ -0,0 +1,12 @@
|
|||
package client
|
||||
|
||||
import (
|
||||
"crazyfs/logging"
|
||||
"github.com/sirupsen/logrus"
|
||||
)
|
||||
|
||||
var log *logrus.Logger
|
||||
|
||||
func init() {
|
||||
log = logging.GetLogger()
|
||||
}
|
|
@ -0,0 +1,12 @@
|
|||
package helpers
|
||||
|
||||
import (
|
||||
"crazyfs/logging"
|
||||
"github.com/sirupsen/logrus"
|
||||
)
|
||||
|
||||
var log *logrus.Logger
|
||||
|
||||
func init() {
|
||||
log = logging.GetLogger()
|
||||
}
|
|
@ -1,29 +1,44 @@
|
|||
package helpers
|
||||
|
||||
import (
|
||||
"crazyfs/logging"
|
||||
"encoding/json"
|
||||
"net/http"
|
||||
)
|
||||
|
||||
func Return400Msg(msg string, w http.ResponseWriter) {
|
||||
func WriteErrorResponse(json_code, http_code int, msg string, w http.ResponseWriter) {
|
||||
//log := logging.GetLogger()
|
||||
//log.Warnln(msg)
|
||||
|
||||
w.Header().Set("Cache-Control", "no-store")
|
||||
w.Header().Set("Content-Type", "application/json")
|
||||
w.WriteHeader(http.StatusBadRequest)
|
||||
json.NewEncoder(w).Encode(map[string]interface{}{
|
||||
"code": http.StatusBadRequest,
|
||||
w.WriteHeader(http_code)
|
||||
|
||||
err := json.NewEncoder(w).Encode(map[string]interface{}{
|
||||
"code": json_code,
|
||||
"error": msg,
|
||||
})
|
||||
|
||||
if err != nil {
|
||||
log.Errorln("HELPERS - WriteErrorResponse failed to encode JSON response: ", err)
|
||||
}
|
||||
}
|
||||
|
||||
func ReturnFake404Msg(msg string, w http.ResponseWriter) {
|
||||
WriteErrorResponse(404, http.StatusBadRequest, msg, w)
|
||||
}
|
||||
|
||||
func Return400Msg(msg string, w http.ResponseWriter) {
|
||||
WriteErrorResponse(http.StatusBadRequest, http.StatusBadRequest, msg, w)
|
||||
}
|
||||
|
||||
func HandleRejectDuringInitialCrawl(w http.ResponseWriter) {
|
||||
log := logging.GetLogger()
|
||||
log.Warnln("Rejecting request during initial crawl")
|
||||
w.Header().Set("Cache-Control", "no-store")
|
||||
w.Header().Set("Content-Type", "application/json")
|
||||
w.WriteHeader(http.StatusServiceUnavailable)
|
||||
json.NewEncoder(w).Encode(map[string]interface{}{
|
||||
"code": http.StatusServiceUnavailable,
|
||||
"error": "initial file system crawl in progress",
|
||||
})
|
||||
WriteErrorResponse(http.StatusServiceUnavailable, http.StatusServiceUnavailable, "initial file system crawl in progress", w)
|
||||
}
|
||||
|
||||
func Return500Msg(w http.ResponseWriter) {
|
||||
WriteErrorResponse(http.StatusInternalServerError, http.StatusInternalServerError, "internal server error", w)
|
||||
}
|
||||
|
||||
func Return403Msg(msg string, w http.ResponseWriter) {
|
||||
WriteErrorResponse(http.StatusForbidden, http.StatusForbidden, msg, w)
|
||||
}
|
||||
|
|
|
@ -1,36 +1,65 @@
|
|||
package helpers
|
||||
|
||||
import (
|
||||
"crazyfs/CacheItem"
|
||||
"crazyfs/cache"
|
||||
"crazyfs/cache/DirectoryCrawler"
|
||||
"crazyfs/config"
|
||||
"crazyfs/data"
|
||||
"crazyfs/logging"
|
||||
"encoding/json"
|
||||
lru "github.com/hashicorp/golang-lru/v2"
|
||||
"net/http"
|
||||
"os"
|
||||
"strconv"
|
||||
"time"
|
||||
)
|
||||
|
||||
func HandleFileNotFound(relPath string, fullPath string, sharedCache *lru.Cache[string, *data.Item], cfg *config.Config, w http.ResponseWriter) *data.Item {
|
||||
// HandleFileNotFound starts a new crawler when the data is not in the cache.
|
||||
func HandleFileNotFound(relPath string, fullPath string, sharedCache *lru.Cache[string, *CacheItem.Item], cfg *config.Config, w http.ResponseWriter) *CacheItem.Item {
|
||||
log := logging.GetLogger()
|
||||
// If the data is not in the cache, start a new crawler
|
||||
|
||||
//log.Fatalf("CRAWLER - %s not in cache, crawling", fullPath)
|
||||
|
||||
log.Debugf("CRAWLER - %s not in cache, crawling", fullPath)
|
||||
pool := cache.NewWorkerPool()
|
||||
crawler := cache.NewDirectoryCrawler(sharedCache, pool)
|
||||
dc := DirectoryCrawler.NewDirectoryCrawler(sharedCache)
|
||||
|
||||
// Check if this is a symlink. We do this before CrawlNoRecursion() because we want to tell the end user that
|
||||
// we're not going to resolve this symlink.
|
||||
//info, err := os.Lstat(fullPath)
|
||||
//if err != nil {
|
||||
// log.Errorf("HandleFileNotFound - os.Lstat failed: %s", err)
|
||||
// Return500Msg(w)
|
||||
// return nil
|
||||
//}
|
||||
//if !config.FollowSymlinks && info.Mode()&os.ModeSymlink > 0 {
|
||||
// Return400Msg("path is a symlink", w)
|
||||
// return nil
|
||||
//}
|
||||
|
||||
// We don't want to traverse the entire directory tree since we'll only return the current directory anyways
|
||||
err := crawler.Crawl(fullPath, false)
|
||||
if err != nil {
|
||||
log.Errorf("LIST - crawl failed: %s", err)
|
||||
w.Header().Set("Content-Type", "application/json")
|
||||
w.WriteHeader(http.StatusInternalServerError)
|
||||
json.NewEncoder(w).Encode(map[string]interface{}{
|
||||
"code": 500,
|
||||
"error": "internal server error",
|
||||
})
|
||||
item, err := dc.CrawlNoRecursion(fullPath)
|
||||
|
||||
if os.IsNotExist(err) || item == nil {
|
||||
ReturnFake404Msg("path not found", w)
|
||||
return nil
|
||||
} else if err != nil {
|
||||
log.Errorf("HandleFileNotFound - crawl failed: %s", err)
|
||||
Return500Msg(w)
|
||||
return nil
|
||||
}
|
||||
|
||||
// Start a recursive crawl in the background.
|
||||
// We've already gotten our cached CacheItem (may be nil if it doesn't exist) so this won't affect our results
|
||||
go func() {
|
||||
log.Debugf("Starting background recursive crawl for %s", fullPath)
|
||||
dc := DirectoryCrawler.NewDirectoryCrawler(sharedCache)
|
||||
start := time.Now()
|
||||
err := dc.Crawl(fullPath, true)
|
||||
if err != nil {
|
||||
log.Errorf("LIST - background recursive crawl failed: %s", err)
|
||||
}
|
||||
log.Debugf("Finished background recursive crawl for %s, elapsed time: %s", fullPath, time.Since(start).Round(time.Second))
|
||||
}()
|
||||
|
||||
// Try to get the data from the cache again
|
||||
item, found := sharedCache.Get(relPath)
|
||||
if !found {
|
||||
|
@ -39,42 +68,27 @@ func HandleFileNotFound(relPath string, fullPath string, sharedCache *lru.Cache[
|
|||
if _, err := os.Stat(fullPath); os.IsNotExist(err) {
|
||||
log.Debugf("File not in cache: %s", fullPath)
|
||||
// If the file or directory does not exist, return a 404 status code and a message
|
||||
w.Header().Set("Content-Type", "application/json")
|
||||
w.WriteHeader(http.StatusNotFound)
|
||||
json.NewEncoder(w).Encode(map[string]interface{}{
|
||||
"code": 400,
|
||||
"error": "file or directory not found",
|
||||
})
|
||||
ReturnFake404Msg("file or directory not found", w)
|
||||
return nil
|
||||
} else if err != nil {
|
||||
// If there was an error checking if the file or directory exists, return a 500 status code and the error
|
||||
log.Errorf("LIST - %s", err.Error())
|
||||
w.Header().Set("Content-Type", "application/json")
|
||||
w.WriteHeader(http.StatusInternalServerError)
|
||||
json.NewEncoder(w).Encode(map[string]interface{}{
|
||||
"code": 500,
|
||||
"error": "internal server error",
|
||||
})
|
||||
Return500Msg(w)
|
||||
return nil
|
||||
}
|
||||
}
|
||||
|
||||
// If item is still nil, error
|
||||
// If CacheItem is still nil, error
|
||||
if item == nil {
|
||||
log.Errorf("LIST - crawler failed to find %s and did not return a 404", relPath)
|
||||
w.Header().Set("Content-Type", "application/json")
|
||||
w.WriteHeader(http.StatusInternalServerError)
|
||||
json.NewEncoder(w).Encode(map[string]interface{}{
|
||||
"code": 500,
|
||||
"error": "crawler failed to fetch file or directory",
|
||||
})
|
||||
Return500Msg(w)
|
||||
return nil
|
||||
}
|
||||
cache.CheckAndRecache(fullPath, cfg, sharedCache)
|
||||
return item
|
||||
}
|
||||
|
||||
func IsPositiveInt(testStr string) bool {
|
||||
func IsNonNegativeInt(testStr string) bool {
|
||||
if num, err := strconv.ParseInt(testStr, 10, 64); err == nil {
|
||||
return num >= 0
|
||||
}
|
||||
|
@ -94,3 +108,19 @@ func Max(a, b int) int {
|
|||
}
|
||||
return b
|
||||
}
|
||||
|
||||
func CheckInitialCrawl() bool {
|
||||
return cache.InitialCrawlInProgress && !config.HttpAllowDuringInitialCrawl
|
||||
}
|
||||
|
||||
func CheckPathRestricted(relPath string) bool {
|
||||
for _, restrictedPath := range config.RestrictedDownloadPaths {
|
||||
if restrictedPath == "" {
|
||||
restrictedPath = "/"
|
||||
}
|
||||
if relPath == restrictedPath {
|
||||
return true
|
||||
}
|
||||
}
|
||||
return false
|
||||
}
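`CheckPathRestricted` compares the requested path against each configured entry by exact equality, treating an empty entry as the root. A minimal standalone sketch of the same logic is below, with the restricted list passed in as a parameter instead of read from `config.RestrictedDownloadPaths`. `isPathRestricted` is a hypothetical name.

```go
package main

// isPathRestricted reports whether relPath exactly matches one of the
// restricted paths. An empty entry is treated as the root directory "/".
func isPathRestricted(relPath string, restricted []string) bool {
	for _, p := range restricted {
		if p == "" {
			p = "/"
		}
		if relPath == p {
			return true
		}
	}
	return false
}
```

Because the match is exact, a restricted entry such as `/private` does not by itself cover `/private/sub`; whether that is intended depends on how callers use the check.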
|
||||
|
|
|
@ -0,0 +1,126 @@
|
|||
package helpers
|
||||
|
||||
import (
|
||||
"crazyfs/CacheItem"
|
||||
"crazyfs/config"
|
||||
"crazyfs/file"
|
||||
lru "github.com/hashicorp/golang-lru/v2"
|
||||
kzip "github.com/klauspost/compress/zip"
|
||||
"io"
|
||||
"net/http"
|
||||
"os"
|
||||
"path/filepath"
|
||||
)
|
||||
|
||||
func ZipHandlerCompress(dirPath string, w http.ResponseWriter, r *http.Request) {
|
||||
w.Header().Set("Content-Type", "application/zip")
|
||||
//w.WriteHeader(http.StatusOK)
|
||||
|
||||
zipWriter := kzip.NewWriter(w)
|
||||
// Walk through the directory and add each file to the zip
|
||||
filepath.Walk(dirPath, func(filePath string, info os.FileInfo, err error) error {
|
||||
if info.IsDir() {
|
||||
return nil
|
||||
}
|
||||
|
||||
// Ensure the file path is relative to the directory being zipped
|
||||
relativePath, err := filepath.Rel(dirPath, filePath)
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
|
||||
writer, err := zipWriter.Create(relativePath)
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
|
||||
file, err := os.Open(filePath)
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
defer file.Close()
|
||||
|
||||
_, err = io.Copy(writer, file)
|
||||
return err
|
||||
})
|
||||
|
||||
err := zipWriter.Close()
|
||||
if err != nil {
|
||||
log.Errorf("ZIPSTREAM - failed to close zipwriter: %s", err)
|
||||
}
|
||||
}
|
||||
func ZipHandlerCompressMultiple(paths []string, w http.ResponseWriter, r *http.Request, cfg *config.Config, sharedCache *lru.Cache[string, *CacheItem.Item]) {
|
||||
zipWriter := kzip.NewWriter(w)
|
||||
// Walk through each file and add it to the zip
|
||||
for _, fullPath := range paths {
|
||||
relPath := file.StripRootDir(fullPath)
|
||||
|
||||
// Try to get the data from the cache
|
||||
item, found := sharedCache.Get(relPath)
|
||||
if !found {
|
||||
item = HandleFileNotFound(relPath, fullPath, sharedCache, cfg, w)
|
||||
}
|
||||
if item == nil {
|
||||
// The errors have already been handled in HandleFileNotFound() so we're good to just exit
|
||||
return
|
||||
}
|
||||
|
||||
if !item.IsDir {
|
||||
writer, err := zipWriter.Create(relPath)
|
||||
if err != nil {
|
||||
Return500Msg(w)
|
||||
return
|
||||
}
|
||||
|
||||
file, err := os.Open(fullPath)
|
||||
if err != nil {
|
||||
Return500Msg(w)
|
||||
return
|
||||
}
|
||||
defer file.Close()
|
||||
|
||||
_, err = io.Copy(writer, file)
|
||||
if err != nil {
|
||||
Return500Msg(w)
|
||||
return
|
||||
}
|
||||
} else {
|
||||
w.Header().Set("Content-Disposition", `attachment; filename="files.zip"`)
|
||||
w.Header().Set("Content-Type", "application/zip")
|
||||
//w.WriteHeader(http.StatusOK)
|
||||
|
||||
// If it's a directory, walk through it and add each file to the zip
|
||||
filepath.Walk(fullPath, func(filePath string, info os.FileInfo, err error) error {
|
||||
if info.IsDir() {
|
||||
return nil
|
||||
}
|
||||
|
||||
// Ensure the file path is relative to the directory being zipped
|
||||
relativePath, err := filepath.Rel(fullPath, filePath)
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
|
||||
writer, err := zipWriter.Create(filepath.Join(relPath, relativePath))
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
|
||||
file, err := os.Open(filePath)
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
defer file.Close()
|
||||
|
||||
_, err = io.Copy(writer, file)
|
||||
return err
|
||||
})
|
||||
}
|
||||
}
|
||||
|
||||
err := zipWriter.Close()
|
||||
if err != nil {
|
||||
log.Errorf("ZIPSTREAM - failed to close zipwriter: %s", err)
|
||||
return
|
||||
}
|
||||
}
|
|
@ -0,0 +1,12 @@
|
|||
package api
|
||||
|
||||
import (
|
||||
"crazyfs/logging"
|
||||
"github.com/sirupsen/logrus"
|
||||
)
|
||||
|
||||
var log *logrus.Logger
|
||||
|
||||
func init() {
|
||||
log = logging.GetLogger()
|
||||
}
|
222
src/api/list.go
|
@ -1,222 +0,0 @@
|
|||
package api
|
||||
|
||||
import (
|
||||
"bytes"
|
||||
"crazyfs/api/helpers"
|
||||
"crazyfs/cache"
|
||||
"crazyfs/config"
|
||||
"crazyfs/data"
|
||||
"crazyfs/logging"
|
||||
"encoding/gob"
|
||||
"encoding/json"
|
||||
lru "github.com/hashicorp/golang-lru/v2"
|
||||
"math"
|
||||
"net/http"
|
||||
"path/filepath"
|
||||
"sort"
|
||||
"strconv"
|
||||
"strings"
|
||||
)
|
||||
|
||||
func ListDir(w http.ResponseWriter, r *http.Request, cfg *config.Config, sharedCache *lru.Cache[string, *data.Item]) {
|
||||
if cache.InitialCrawlInProgress && !cfg.HttpAllowDuringInitialCrawl {
|
||||
helpers.HandleRejectDuringInitialCrawl(w)
|
||||
return
|
||||
}
|
||||
|
||||
log := logging.GetLogger()
|
||||
pathArg := r.URL.Query().Get("path")
|
||||
|
||||
sortArg := r.URL.Query().Get("sort")
|
||||
var folderSorting string
|
||||
if sortArg == "default" || sortArg == "" {
|
||||
folderSorting = "default"
|
||||
} else if sortArg == "folders" {
|
||||
folderSorting = "folders"
|
||||
} else {
|
||||
helpers.Return400Msg("folders arg must be 'default' (to not do any sorting) or 'first' (to sort the folders to the front of the list)", w)
|
||||
return
|
||||
}
|
||||
|
||||
// Clean the path to prevent directory traversal
|
||||
// filepath.Clean() below will do most of the work but these are just a few checks
|
||||
// Also this will break the cache because it will create another entry for the relative path
|
||||
if strings.Contains(pathArg, "/../") || strings.HasPrefix(pathArg, "../") || strings.HasSuffix(pathArg, "/..") || strings.HasPrefix(pathArg, "~") {
|
||||
w.Header().Set("Content-Type", "application/json")
|
||||
w.WriteHeader(http.StatusBadRequest)
|
||||
json.NewEncoder(w).Encode(map[string]interface{}{
|
||||
"code": http.StatusBadRequest,
|
||||
"error": "invalid file path",
|
||||
})
|
||||
return
|
||||
}
|
||||
|
||||
fullPath := filepath.Join(cfg.RootDir, filepath.Clean("/"+pathArg))
|
||||
relPath := cache.StripRootDir(fullPath, cfg.RootDir)
|
||||
// Try to get the data from the cache
|
||||
cacheItem, found := sharedCache.Get(relPath)
|
||||
if !found {
|
||||
cacheItem = helpers.HandleFileNotFound(relPath, fullPath, sharedCache, cfg, w)
|
||||
// Start a recursive crawl in the background.
|
||||
// We've already gotten our cached item (may be null if it doesn't exist) so this won't affect our results
|
||||
go func() {
|
||||
log.Debugf("LIST - starting background recursive crawl for %s", fullPath)
|
||||
pool := cache.NewWorkerPool()
|
||||
crawler := cache.NewDirectoryCrawler(sharedCache, pool)
|
||||
err := crawler.Crawl(fullPath, true)
|
||||
if err != nil {
|
||||
log.Errorf("LIST - background recursive crawl failed: %s", err)
|
||||
}
|
||||
}()
|
||||
}
|
||||
if cacheItem == nil {
|
||||
return // The errors have already been handled in handleFileNotFound() so we're good to just exit
|
||||
}
|
||||
|
||||
// Create a deep copy of the cached item so we don't modify the item in the cache
|
||||
var buf bytes.Buffer
|
||||
enc := gob.NewEncoder(&buf)
|
||||
dec := gob.NewDecoder(&buf)
|
||||
err := enc.Encode(cacheItem)
|
||||
if err != nil {
|
||||
log.Errorf("Error encoding item: %v", err)
|
||||
return
|
||||
}
|
||||
var item data.Item
|
||||
err = dec.Decode(&item)
|
||||
if err != nil {
|
||||
log.Errorf("Error decoding item: %v", err)
|
||||
return
|
||||
}
|
||||
|
||||
// Get the MIME type of the file if the 'mime' argument is present
|
||||
mime := r.URL.Query().Get("mime")
|
||||
if mime != "" {
|
||||
if item.IsDir && !cfg.HttpAllowDirMimeParse {
|
||||
w.Header().Set("Content-Type", "application/json")
|
||||
w.WriteHeader(http.StatusForbidden)
|
||||
json.NewEncoder(w).Encode(map[string]interface{}{
|
||||
"code": 403,
|
||||
"error": "not allowed to analyze the mime of directories",
|
||||
})
|
||||
return
|
||||
} else if !item.IsDir {
|
||||
// Only update the mime in the cache if it hasn't been set already.
|
||||
// TODO: need to make sure that when a re-crawl is triggered, the Type is set back to nil
|
||||
if item.Type == nil {
|
||||
fileExists, mimeType, ext, err := cache.GetFileMime(fullPath, true)
|
||||
if !fileExists {
|
||||
helpers.Return400Msg("file not found", w)
|
||||
}
|
||||
if err != nil {
|
||||
log.Warnf("Error detecting MIME type: %v", err)
|
||||
w.Header().Set("Content-Type", "application/json")
|
||||
w.WriteHeader(http.StatusInternalServerError)
|
||||
json.NewEncoder(w).Encode(map[string]interface{}{
|
||||
"code": 500,
|
||||
"error": "internal server error",
|
||||
})
|
||||
return
|
||||
}
|
||||
// Update the original cached item's MIME in the sharedCache
|
||||
cacheItem.Type = &mimeType
|
||||
cacheItem.Extension = &ext
|
||||
sharedCache.Add(relPath, cacheItem) // take the address of item
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
response := map[string]interface{}{}
|
||||
|
||||
// Pagination
|
||||
var paginationLimit int
|
||||
if r.URL.Query().Get("limit") != "" {
|
||||
if !helpers.IsPositiveInt(r.URL.Query().Get("limit")) {
|
||||
helpers.Return400Msg("limit must be a positive number", w)
|
||||
return
|
||||
}
|
||||
i, _ := strconv.ParseInt(r.URL.Query().Get("limit"), 10, 32)
|
||||
paginationLimit = int(i)
|
||||
} else {
|
||||
paginationLimit = 100
|
||||
}
|
||||
|
||||
totalPages := math.Ceil(float64(len(item.Children)) / float64(paginationLimit))
|
||||
if r.URL.Query().Get("page") != "" {
|
||||
response["total_pages"] = int(totalPages)
|
||||
}
|
||||
|
||||
if folderSorting == "folders" {
|
||||
sort.Slice(item.Children, func(i, j int) bool {
|
||||
return item.Children[i].IsDir && !item.Children[j].IsDir
|
||||
})
|
||||
}
|
||||
|
||||
// Set the children to an empty array so that the JSON encoder doesn't return it as nil
|
||||
var paginatedChildren []*data.Item // this var is either the full item list or a paginated list depending on the query args
|
||||
if item.Children != nil {
|
||||
paginatedChildren = item.Children
|
||||
} else {
|
||||
paginatedChildren = make([]*data.Item, 0)
|
||||
}
|
||||
|
||||
pageParam := r.URL.Query().Get("page")
|
||||
if pageParam != "" {
|
||||
page, err := strconv.Atoi(pageParam)
|
||||
if err != nil || page < 1 || page > int(totalPages) {
|
||||
//w.Header().Set("Content-Type", "application/json")
|
||||
//w.WriteHeader(http.StatusBadRequest)
|
||||
//json.NewEncoder(w).Encode(map[string]interface{}{
|
||||
// "code": http.StatusBadRequest,
|
||||
// "error": "invalid page number",
|
||||
// "total_pages": int(totalPages),
|
||||
//})
|
||||
//return
|
||||
|
||||
// Don't return an error; clamp an invalid page number to the last page
|
||||
page = int(totalPages)
|
||||
}
|
||||
|
||||
start := (page - 1) * paginationLimit
|
||||
end := start + paginationLimit
|
||||
|
||||
if start >= 0 { // a negative start (e.g. when totalPages is 0) would cause a slice-bounds panic
|
||||
if start > len(item.Children) {
|
||||
start = len(item.Children)
|
||||
}
|
||||
if end > len(item.Children) {
|
||||
end = len(item.Children)
|
||||
}
|
||||
paginatedChildren = paginatedChildren[start:end]
|
||||
}
|
||||
}
|
||||
|
||||
// TODO: don't use deprecated file read methods
|
||||
|
||||
//if cfg.HttpAPIListCacheControl > 0 {
|
||||
// w.Header().Set("Cache-Control", fmt.Sprintf("public, max-age=%d, must-revalidate", cfg.HttpAPIListCacheControl))
|
||||
//} else {
|
||||
w.Header().Set("Cache-Control", "no-store")
|
||||
//}
|
||||
|
||||
for i := range paginatedChildren {
|
||||
paginatedChildren[i].Children = nil
|
||||
}
|
||||
|
||||
response["item"] = map[string]interface{}{
|
||||
"path": item.Path,
|
||||
"name": item.Name,
|
||||
"size": item.Size,
|
||||
"extension": item.Extension,
|
||||
"modified": item.Modified,
|
||||
"mode": item.Mode,
|
||||
"isDir": item.IsDir,
|
||||
"isSymlink": item.IsSymlink,
|
||||
"cached": item.Cached,
|
||||
"children": paginatedChildren,
|
||||
"type": item.Type,
|
||||
}
|
||||
|
||||
w.Header().Set("Content-Type", "application/json")
|
||||
json.NewEncoder(w).Encode(response)
|
||||
}
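
The pagination arithmetic in the handler above can be isolated into a small helper. The following is a minimal sketch, not code from this commit: `pageBounds` is a hypothetical name, and unlike the handler (which sends every invalid page to the last page) it clamps pages below 1 to the first page. It uses integer ceiling division instead of `math.Ceil` and returns an empty range for an empty listing, so no separate negative-start guard is needed.

```go
package main

import "fmt"

// pageBounds (hypothetical helper) returns the half-open slice range
// [start, end) for a 1-based page number, plus the total page count.
// Out-of-range pages are clamped to the nearest valid page.
func pageBounds(total, page, limit int) (start, end, totalPages int) {
	if total <= 0 || limit <= 0 {
		// Nothing to paginate: return an empty range.
		return 0, 0, 0
	}
	totalPages = (total + limit - 1) / limit // integer ceiling division
	if page < 1 {
		page = 1
	}
	if page > totalPages {
		page = totalPages
	}
	start = (page - 1) * limit
	end = start + limit
	if end > total {
		end = total
	}
	return start, end, totalPages
}

func main() {
	children := []string{"a", "b", "c", "d", "e"}
	start, end, pages := pageBounds(len(children), 2, 2)
	fmt.Println(children[start:end], pages) // prints "[c d] 3"
}
```

Because the returned range is always within `[0, total]`, the caller can slice directly with `children[start:end]`.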
|
|
@ -1,8 +1,9 @@
|
|||
package api
|
||||
|
||||
import (
|
||||
"crazyfs/CacheItem"
|
||||
"crazyfs/api/client"
|
||||
"crazyfs/config"
|
||||
"crazyfs/data"
|
||||
"crazyfs/logging"
|
||||
"encoding/json"
|
||||
"fmt"
|
||||
|
@ -20,7 +21,7 @@ type Route struct {
|
|||
|
||||
type Routes []Route
|
||||
|
||||
type AppHandler func(http.ResponseWriter, *http.Request, *config.Config, *lru.Cache[string, *data.Item])
|
||||
type AppHandler func(http.ResponseWriter, *http.Request, *config.Config, *lru.Cache[string, *CacheItem.Item])
|
||||
|
||||
var routes = Routes{
|
||||
Route{
|
||||
|
@ -56,13 +57,13 @@ var routes = Routes{
|
|||
Route{
|
||||
"Trigger Recache",
|
||||
"POST",
|
||||
"/api/admin/recache",
|
||||
"/api/admin/cache/recache",
|
||||
AdminReCache,
|
||||
},
|
||||
Route{
|
||||
"Trigger Recache",
|
||||
"GET",
|
||||
"/api/admin/recache",
|
||||
"/api/admin/cache/recache",
|
||||
wrongMethod("POST", AdminReCache),
|
||||
},
|
||||
Route{
|
||||
|
@ -71,6 +72,27 @@ var routes = Routes{
|
|||
"/api/health",
|
||||
HealthCheck,
|
||||
},
|
||||
|
||||
// TODO: remove
|
||||
Route{
|
||||
"Server Health",
|
||||
"GET",
|
||||
"/api/health",
|
||||
HealthCheck,
|
||||
},
|
||||
|
||||
Route{
|
||||
"Server Health",
|
||||
"GET",
|
||||
"/api/client/health",
|
||||
client.ClientHealthCheck,
|
||||
},
|
||||
Route{
|
||||
"Restricted Directories",
|
||||
"GET",
|
||||
"/api/client/restricted",
|
||||
client.RestrictedDownloadDirectories,
|
||||
},
|
||||
}
|
||||
|
||||
func setHeaders(next http.Handler) http.Handler {
|
||||
|
@ -82,7 +104,7 @@ func setHeaders(next http.Handler) http.Handler {
|
|||
})
|
||||
}
|
||||
|
||||
func NewRouter(cfg *config.Config, sharedCache *lru.Cache[string, *data.Item]) *mux.Router {
|
||||
func NewRouter(cfg *config.Config, sharedCache *lru.Cache[string, *CacheItem.Item]) *mux.Router {
|
||||
r := mux.NewRouter().StrictSlash(true)
|
||||
for _, route := range routes {
|
||||
var handler http.Handler
|
||||
|
@ -117,7 +139,7 @@ func NewRouter(cfg *config.Config, sharedCache *lru.Cache[string, *data.Item]) *
|
|||
}
|
||||
|
||||
func wrongMethod(expectedMethod string, next AppHandler) AppHandler {
|
||||
return func(w http.ResponseWriter, r *http.Request, cfg *config.Config, sharedCache *lru.Cache[string, *data.Item]) {
|
||||
return func(w http.ResponseWriter, r *http.Request, cfg *config.Config, sharedCache *lru.Cache[string, *CacheItem.Item]) {
|
||||
w.Header().Set("Content-Type", "application/json")
|
||||
w.WriteHeader(http.StatusBadRequest)
|
||||
json.NewEncoder(w).Encode(map[string]interface{}{
|
||||
|
|
|
@ -1,257 +0,0 @@
|
|||
package api
|
||||
|
||||
import (
|
||||
"bytes"
|
||||
"crazyfs/api/helpers"
|
||||
"crazyfs/cache"
|
||||
"crazyfs/config"
|
||||
"crazyfs/data"
|
||||
"crazyfs/file"
|
||||
"crazyfs/logging"
|
||||
"encoding/json"
|
||||
"github.com/disintegration/imaging"
|
||||
lru "github.com/hashicorp/golang-lru/v2"
|
||||
|
||||
"image"
|
||||
"image/color"
|
||||
"image/png"
|
||||
"net/http"
|
||||
"path/filepath"
|
||||
"strconv"
|
||||
"strings"
|
||||
|
||||
"github.com/nfnt/resize"
|
||||
)
|
||||
|
||||
func Thumbnail(w http.ResponseWriter, r *http.Request, cfg *config.Config, sharedCache *lru.Cache[string, *data.Item]) {
|
||||
if cache.InitialCrawlInProgress && !cfg.HttpAllowDuringInitialCrawl {
|
||||
helpers.HandleRejectDuringInitialCrawl(w)
|
||||
returnDummyPNG(w)
|
||||
return
|
||||
}
|
||||
|
||||
log := logging.GetLogger()
|
||||
relPath := cache.StripRootDir(filepath.Join(cfg.RootDir, r.URL.Query().Get("path")), cfg.RootDir)
|
||||
relPath = strings.TrimSuffix(relPath, "/")
|
||||
fullPath := filepath.Join(cfg.RootDir, relPath)
|
||||
|
||||
// Validate args before doing any operations
|
||||
widthStr := r.URL.Query().Get("width")
|
||||
if widthStr != "" {
|
||||
if !helpers.IsPositiveInt(widthStr) {
|
||||
helpers.Return400Msg("height and width must both be positive numbers", w)
|
||||
return
|
||||
}
|
||||
}
|
||||
heightStr := r.URL.Query().Get("height")
|
||||
if heightStr != "" {
|
||||
if !helpers.IsPositiveInt(heightStr) {
|
||||
helpers.Return400Msg("height and width must both be positive numbers", w)
|
||||
return
|
||||
}
|
||||
}
|
||||
|
||||
pngQualityStr := r.URL.Query().Get("quality")
|
||||
var pngQuality int
|
||||
if pngQualityStr != "" {
|
||||
if !helpers.IsPositiveInt(pngQualityStr) {
|
||||
helpers.Return400Msg("quality must be a positive number", w)
|
||||
return
|
||||
}
|
||||
pngQuality64, _ := strconv.ParseInt(pngQualityStr, 10, 32)
|
||||
pngQuality = int(pngQuality64)
|
||||
} else {
|
||||
pngQuality = 50
|
||||
}
|
||||
|
||||
scaleStr := r.URL.Query().Get("auto")
|
||||
var autoScale bool
|
||||
if scaleStr != "" {
|
||||
autoScale = true
|
||||
}
|
||||
|
||||
squareStr := r.URL.Query().Get("square")
|
||||
var square bool
|
||||
if squareStr != "" {
|
||||
square = true
|
||||
}
|
||||
|
||||
// Try to get the data from the cache
|
||||
item, found := sharedCache.Get(relPath)
|
||||
if !found {
|
||||
item = helpers.HandleFileNotFound(relPath, fullPath, sharedCache, cfg, w)
|
||||
}
|
||||
if item == nil {
|
||||
return
|
||||
}
|
||||
|
||||
if item.IsDir {
|
||||
w.Header().Set("Content-Type", "application/json")
|
||||
w.WriteHeader(http.StatusBadRequest)
|
||||
json.NewEncoder(w).Encode(map[string]interface{}{
|
||||
"code": http.StatusBadRequest,
|
||||
"error": "that's a directory",
|
||||
})
|
||||
return
|
||||
}
|
||||
|
||||
// Get the MIME type of the file
|
||||
fileExists, mimeType, ext, err := cache.GetFileMime(fullPath, true)
|
||||
if !fileExists {
|
||||
helpers.Return400Msg("file not found", w)
|
||||
return
|
||||
}
|
||||
if err != nil {
|
||||
log.Errorf("THUMB = error detecting MIME type: %v", err)
|
||||
w.Header().Set("Content-Type", "application/json")
|
||||
w.WriteHeader(http.StatusInternalServerError)
|
||||
json.NewEncoder(w).Encode(map[string]interface{}{
|
||||
"code": 500,
|
||||
"error": "internal server error",
|
||||
})
|
||||
return
|
||||
}
|
||||
// Update the item's MIME in the sharedCache
|
||||
item.Type = &mimeType
|
||||
item.Extension = &ext
|
||||
sharedCache.Add(relPath, item)
|
||||
|
||||
// Check if the file is an image
|
||||
if !strings.HasPrefix(mimeType, "image/") {
|
||||
w.Header().Set("Content-Type", "application/json")
|
||||
w.WriteHeader(http.StatusBadRequest)
|
||||
json.NewEncoder(w).Encode(map[string]interface{}{
|
||||
"code": http.StatusBadRequest,
|
||||
"error": "file is not an image",
|
||||
})
|
||||
return
|
||||
}
|
||||
|
||||
// Convert the image to a PNG
|
||||
imageBytes, err := file.ConvertToPNG(fullPath, mimeType)
|
||||
if err != nil {
|
||||
log.Warnf("Error converting %s to PNG: %v", fullPath, err)
|
||||
returnDummyPNG(w)
|
||||
return
|
||||
}
|
||||
|
||||
// Decode the image
|
||||
var img image.Image
|
||||
img, err = png.Decode(bytes.NewReader(imageBytes))
|
||||
if err != nil {
|
||||
log.Warnf("Error decoding %s image data: %v", fullPath, err)
|
||||
returnDummyPNG(w)
|
||||
return
|
||||
}
|
||||
|
||||
// Resize the image
|
||||
var width, height uint
|
||||
if widthStr != "" {
|
||||
width64, _ := strconv.ParseUint(widthStr, 10, 32)
|
||||
width = uint(width64)
|
||||
}
|
||||
if heightStr != "" {
|
||||
height64, _ := strconv.ParseUint(heightStr, 10, 32)
|
||||
height = uint(height64)
|
||||
}
|
||||
|
||||
if square {
|
||||
var size int
|
||||
if width == 0 && height == 0 {
|
||||
size = 300
|
||||
} else if (width != 0 && height != 0) && (width != height) {
|
||||
helpers.Return400Msg("width and height must be equal in square mode, or only one provided", w)
|
||||
return
|
||||
} else if width != 0 {
|
||||
size = int(width)
|
||||
} else {
|
||||
size = int(height)
|
||||
}
|
||||
if size > img.Bounds().Dx() || size > img.Bounds().Dy() {
|
||||
size = helpers.Max(img.Bounds().Dx(), img.Bounds().Dy())
|
||||
}
|
||||
|
||||
// First, scale the image so that its smallest dimension matches the target size
|
||||
if img.Bounds().Dx() > img.Bounds().Dy() {
|
||||
width = 0
|
||||
height = uint(size)
|
||||
} else {
|
||||
width = uint(size)
|
||||
height = 0
|
||||
}
|
||||
resized := resize.Resize(width, height, img, resize.Lanczos3)
|
||||
|
||||
// Then crop the image to the target size
|
||||
img = imaging.CropCenter(resized, size, size)
|
||||
} else {
|
||||
if width == 0 && height == 0 {
|
||||
if autoScale {
|
||||
// If both width and height parameters are not provided, set
|
||||
// the largest dimension to 300 and scale the other.
|
||||
if img.Bounds().Dx() > img.Bounds().Dy() {
|
||||
width = 300
|
||||
height = 0
|
||||
} else {
|
||||
width = 0
|
||||
height = 300
|
||||
}
|
||||
} else {
|
||||
// Don't auto-resize because this endpoint can also be used for simply reducing the quality of an image
|
||||
width = uint(img.Bounds().Dx())
|
||||
height = uint(img.Bounds().Dy())
|
||||
}
|
||||
} else if width == 0 {
|
||||
// If only height is provided, calculate the width based on the image's aspect ratio
|
||||
width = uint(img.Bounds().Dx()) * height / uint(img.Bounds().Dy())
|
||||
} else if height == 0 {
|
||||
height = uint(img.Bounds().Dy()) * width / uint(img.Bounds().Dx())
|
||||
}
|
||||
// Scale the image. If the image is smaller than the provided height or width, it won't be resized.
|
||||
img = resize.Resize(width, height, img, resize.Lanczos3)
|
||||
}
|
||||
|
||||
// Encode the image
|
||||
//buf := new(bytes.Buffer)
|
||||
//if err := png.Encode(buf, img); err != nil {
|
||||
// log.Warnf("Error encoding %s to PNG: %v", fullPath, err)
|
||||
// returnDummyPNG(w)
|
||||
// //w.Header().Set("Content-Type", "application/json")
|
||||
// //w.WriteHeader(http.StatusInternalServerError)
|
||||
// //json.NewEncoder(w).Encode(map[string]interface{}{
|
||||
// // "code": 500,
|
||||
// // "error": "500 internal server error",
|
||||
// //})
|
||||
// return
|
||||
//}
|
||||
|
||||
buf, err := file.CompressPNGFile(img, pngQuality)
|
||||
if err != nil {
|
||||
log.Warnf("Error compressing %s to PNG: %v", fullPath, err)
|
||||
returnDummyPNG(w)
|
||||
return
|
||||
}
|
||||
|
||||
// Return the image
|
||||
w.Header().Set("Content-Type", "image/png")
|
||||
w.Write(buf.Bytes())
|
||||
}
|
||||
|
||||
func returnDummyPNG(w http.ResponseWriter) {
|
||||
img := image.NewRGBA(image.Rect(0, 0, 300, 300))
|
||||
white := color.RGBA{255, 255, 255, 255}
|
||||
|
||||
for y := 0; y < img.Bounds().Dy(); y++ {
|
||||
for x := 0; x < img.Bounds().Dx(); x++ {
|
||||
img.Set(x, y, white)
|
||||
}
|
||||
}
|
||||
|
||||
buffer := new(bytes.Buffer)
|
||||
if err := png.Encode(buffer, img); err != nil {
|
||||
http.Error(w, err.Error(), http.StatusInternalServerError)
|
||||
return
|
||||
}
|
||||
|
||||
// TODO: set cache-control
|
||||
|
||||
w.Header().Set("Content-Type", "image/png")
|
||||
w.Write(buffer.Bytes())
|
||||
}
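
The non-square branch of the thumbnail handler above derives a missing dimension from the source aspect ratio. A minimal sketch of that arithmetic follows, assuming unsigned pixel sizes as in the handler; `scaleToFit` is a hypothetical name and does not appear in this commit.

```go
package main

import "fmt"

// scaleToFit (hypothetical helper) fills in whichever of width/height is
// zero so that the source aspect ratio is preserved. If both are zero the
// source dimensions are returned unchanged; if both are set they are
// returned as given.
func scaleToFit(srcW, srcH, width, height uint) (uint, uint) {
	if srcW == 0 || srcH == 0 {
		// Degenerate source image: avoid dividing by zero.
		return width, height
	}
	switch {
	case width == 0 && height == 0:
		return srcW, srcH
	case width == 0:
		// Only height was requested: derive the width.
		return srcW * height / srcH, height
	case height == 0:
		// Only width was requested: derive the height.
		return width, srcH * width / srcW
	default:
		return width, height
	}
}

func main() {
	w, h := scaleToFit(1920, 1080, 0, 540)
	fmt.Println(w, h) // prints "960 540"
}
```

The division truncates, so results can be off by one pixel from the exact ratio.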
|
|
@ -1,140 +0,0 @@
|
|||
package cache
|
||||
|
||||
import (
|
||||
"crazyfs/data"
|
||||
lru "github.com/hashicorp/golang-lru/v2"
|
||||
"os"
|
||||
"path/filepath"
|
||||
"strings"
|
||||
"sync"
|
||||
)
|
||||
|
||||
// Config values
|
||||
var FollowSymlinks bool
|
||||
var WorkerBufferSize int
|
||||
var PrintNew bool
|
||||
var RootDir string
|
||||
var CrawlerParseMIME bool
|
||||
|
||||
type DirectoryCrawler struct {
|
||||
cache *lru.Cache[string, *data.Item]
|
||||
pool *WorkerPool
|
||||
}
|
||||
|
||||
func NewDirectoryCrawler(cache *lru.Cache[string, *data.Item], pool *WorkerPool) *DirectoryCrawler {
|
||||
return &DirectoryCrawler{cache: cache, pool: pool}
|
||||
}
|
||||
|
||||
func (dc *DirectoryCrawler) Crawl(path string, recursive bool) error {
|
||||
info, err := os.Stat(path)
|
||||
if os.IsNotExist(err) {
|
||||
// If the path doesn't exist, just silently exit
|
||||
return nil
|
||||
}
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
|
||||
// Get a list of all keys in the cache that belong to this directory
|
||||
keys := make([]string, 0)
|
||||
for _, key := range dc.cache.Keys() {
|
||||
if strings.HasPrefix(key, path) {
|
||||
keys = append(keys, key)
|
||||
}
|
||||
}
|
||||
|
||||
// Remove all entries in the cache that belong to this directory so we can start fresh
|
||||
for _, key := range keys {
|
||||
dc.cache.Remove(key)
|
||||
}
|
||||
|
||||
if info.IsDir() {
|
||||
// If the path is a directory, walk the directory
|
||||
var wg sync.WaitGroup
|
||||
err := dc.walkDir(path, &wg, info, recursive)
|
||||
if err != nil {
|
||||
log.Errorf("CRAWLER - dc.walkDir() in Crawl() returned error: %s", err)
|
||||
}
|
||||
} else {
|
||||
// If the path is a file, add it to the cache directly
|
||||
dc.cache.Add(StripRootDir(path, RootDir), NewItem(path, info))
|
||||
}
|
||||
|
||||
return nil
|
||||
}
|
||||
|
||||
func (dc *DirectoryCrawler) walkDir(dir string, n *sync.WaitGroup, dirInfo os.FileInfo, recursive bool) error {
|
||||
// We are handling errors for each file or directory individually. Does this slow things down?
|
||||
|
||||
entries, err := os.ReadDir(dir)
|
||||
if err != nil {
|
||||
log.Errorf("CRAWLER - walkDir() failed to read directory %s: %s", dir, err)
|
||||
return err
|
||||
}
|
||||
|
||||
// Create the directory item but don't add it to the cache yet
|
||||
dirItem := NewItem(dir, dirInfo)
|
||||
|
||||
for _, entry := range entries {
|
||||
subpath := filepath.Join(dir, entry.Name())
|
||||
info, err := os.Lstat(subpath) // update the info var with the new entry
|
||||
if err != nil {
|
||||
log.Warnf("CRAWLER - walkDir() failed to stat subpath %s: %s", subpath, err)
|
||||
continue
|
||||
}
|
||||
if FollowSymlinks && info.Mode()&os.ModeSymlink != 0 {
|
||||
link, err := os.Readlink(subpath)
|
||||
if err != nil {
|
||||
log.Warnf("CRAWLER - walkDir() failed to read symlink %s: %s", subpath, err)
|
||||
continue
|
||||
}
|
||||
info, err = os.Stat(link)
|
||||
if err != nil {
|
||||
log.Warnf("CRAWLER - walkDir() failed to stat link %s: %s", link, err)
|
||||
continue
|
||||
}
|
||||
}
|
||||
if entry.IsDir() && recursive {
|
||||
n.Add(1)
|
||||
go func() {
|
||||
defer n.Done() // mark this walk as finished when the goroutine exits
|
||||
err := dc.walkDir(subpath, n, info, recursive)
|
||||
if err != nil {
|
||||
log.Errorf("CRAWLER - dc.walkDir() in walkDir() -> IsDir() returned error: %s", err)
|
||||
}
|
||||
}()
|
||||
} else {
|
||||
w := dc.pool.Get()
|
||||
w.add(subpath)
|
||||
dc.pool.Put(w)
|
||||
}
|
||||
|
||||
// Add the entry to the directory's contents
|
||||
entryItem := NewItem(subpath, info)
|
||||
dirItem.Children = append(dirItem.Children, entryItem)
|
||||
}
|
||||
|
||||
// Add the directory to the cache after all of its children have been processed
|
||||
dc.cache.Add(StripRootDir(dir, RootDir), dirItem)
|
||||
|
||||
// If the directory is not the root directory, update the parent directory's Children field
|
||||
if dir != RootDir {
|
||||
parentDir := filepath.Dir(dir)
|
||||
parentItem, found := dc.cache.Get(StripRootDir(parentDir, RootDir))
|
||||
if found {
|
||||
// Remove the old version of the directory from the parent's Children field
|
||||
for i, child := range parentItem.Children {
|
||||
if child.Path == StripRootDir(dir, RootDir) {
|
||||
parentItem.Children = append(parentItem.Children[:i], parentItem.Children[i+1:]...)
|
||||
break
|
||||
}
|
||||
}
|
||||
// Add the new version of the directory to the parent's Children field
|
||||
parentItem.Children = append(parentItem.Children, dirItem)
|
||||
// Update the parent directory in the cache
|
||||
dc.cache.Add(StripRootDir(parentDir, RootDir), parentItem)
|
||||
}
|
||||
}
|
||||
|
||||
return nil
|
||||
}
|
|
@ -0,0 +1,89 @@
|
|||
package DirectoryCrawler
|
||||
|
||||
import (
|
||||
"crazyfs/CacheItem"
|
||||
"crazyfs/file"
|
||||
lru "github.com/hashicorp/golang-lru/v2"
|
||||
"os"
|
||||
"path/filepath"
|
||||
"strings"
|
||||
"sync"
|
||||
"sync/atomic"
|
||||
)
|
||||
|
||||
var globalActiveCrawls int32
|
||||
|
||||
type DirectoryCrawler struct {
|
||||
cache *lru.Cache[string, *CacheItem.Item]
|
||||
visited sync.Map
|
||||
wg sync.WaitGroup
|
||||
mu sync.Mutex // lock for the visited map
|
||||
}
|
||||
|
||||
func NewDirectoryCrawler(cache *lru.Cache[string, *CacheItem.Item]) *DirectoryCrawler {
|
||||
return &DirectoryCrawler{
|
||||
cache: cache,
|
||||
visited: sync.Map{},
|
||||
}
|
||||
}
|
||||
|
||||
func (dc *DirectoryCrawler) Get(path string) (*CacheItem.Item, bool) {
|
||||
return dc.cache.Get(path)
|
||||
}
|
||||
|
||||
func (dc *DirectoryCrawler) CleanupDeletedFiles(path string) {
|
||||
dc.visited.Range(func(key, value interface{}) bool {
|
||||
keyStr := key.(string)
|
||||
if isSubpath(file.StripRootDir(path), keyStr) && value.(bool) {
|
||||
dc.cache.Remove(keyStr)
|
||||
}
|
||||
return true
|
||||
})
|
||||
}
|
||||
|
||||
func (dc *DirectoryCrawler) AddCacheItem(fullPath string, info os.FileInfo) {
|
||||
strippedPath := file.StripRootDir(fullPath)
|
||||
item := CacheItem.NewItem(fullPath, info)
|
||||
if item != nil {
|
||||
// Sometimes CacheItem.NewItem will return nil if the path fails its checks
|
||||
dc.cache.Add(strippedPath, item)
|
||||
} else {
|
||||
//log.Errorf("NewItem returned nil for %s", fullPath)
|
||||
}
|
||||
}
|
||||
|
||||
func isSubpath(path, subpath string) bool {
|
||||
// Clean the paths to remove any redundant or relative elements
|
||||
path = filepath.Clean(path)
|
||||
subpath = filepath.Clean(subpath)
|
||||
|
||||
// Split the paths into their components
|
||||
pathParts := strings.Split(path, string(os.PathSeparator))
|
||||
subpathParts := strings.Split(subpath, string(os.PathSeparator))
|
||||
|
||||
// If the subpath has fewer components than the path, it cannot be a subpath
|
||||
if len(subpathParts) < len(pathParts) {
|
||||
return false
|
||||
}
|
||||
|
||||
// Compare the components of the path and the subpath
|
||||
for i, part := range pathParts {
|
||||
if part != subpathParts[i] {
|
||||
return false
|
||||
}
|
||||
}
|
||||
|
||||
return true
|
||||
}
|
||||
|
||||
func (dc *DirectoryCrawler) incrementGlobalActiveCrawls() {
|
||||
atomic.AddInt32(&globalActiveCrawls, 1)
|
||||
}
|
||||
|
||||
func (dc *DirectoryCrawler) decrementGlobalActiveCrawls() {
|
||||
atomic.AddInt32(&globalActiveCrawls, -1)
|
||||
}
|
||||
|
||||
func GetGlobalActiveCrawls() int32 {
|
||||
return atomic.LoadInt32(&globalActiveCrawls)
|
||||
}
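
The old crawler matched cache keys with `strings.HasPrefix`, while `isSubpath` above compares path components. A short sketch of why the component-wise check matters, assuming Unix-style paths; `isUnder` is a hypothetical stand-in that mirrors the logic of `isSubpath`, and the paths are made up for illustration.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// isUnder (hypothetical) reports whether candidate is parent itself or
// lies below it, comparing whole path components rather than raw bytes.
func isUnder(parent, candidate string) bool {
	sep := string(filepath.Separator)
	parentParts := strings.Split(filepath.Clean(parent), sep)
	candParts := strings.Split(filepath.Clean(candidate), sep)
	if len(candParts) < len(parentParts) {
		return false
	}
	for i, part := range parentParts {
		if part != candParts[i] {
			return false
		}
	}
	return true
}

func main() {
	// A raw prefix test treats a sibling directory as a child.
	fmt.Println(strings.HasPrefix("/srv/data-old/x", "/srv/data")) // prints "true"
	// The component-wise test does not.
	fmt.Println(isUnder("/srv/data", "/srv/data-old/x")) // prints "false"
	fmt.Println(isUnder("/srv/data", "/srv/data/sub/x")) // prints "true"
}
```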
|
|
@ -0,0 +1,205 @@
|
|||
package DirectoryCrawler
|
||||
|
||||
import (
|
||||
"errors"
|
||||
"fmt"
|
||||
"os"
|
||||
"path/filepath"
|
||||
"sync"
|
||||
"sync/atomic"
|
||||
)
|
||||
|
||||
var JobQueueSize int
|
||||
|
||||
// WorkerPool is a buffered channel acting as a semaphore to limit the number of active workers globally
|
||||
var WorkerPool chan struct{}
|
||||
|
||||
// ActiveWorkers is an atomic counter for the number of active workers
|
||||
var ActiveWorkers int32
|
||||
|
||||
// ActiveWalks is an atomic counter for the number of active Walk crawls
|
||||
var ActiveWalks int32
|
||||
|
||||
// ErrNotDir indicates that the path, which is being passed
|
||||
// to a walker function, does not point to a directory
|
||||
var ErrNotDir = errors.New("not a directory")
|
||||
|
||||
// Walker is constructed for each Walk() function invocation
|
||||
type Walker struct {
|
||||
wg sync.WaitGroup
|
||||
jobs chan string
|
||||
root string
|
||||
followSymlinks bool
|
||||
walkFunc filepath.WalkFunc
|
||||
}
|
||||
|
||||
// the readDirNames function below was taken from the original
|
||||
// implementation (see https://golang.org/src/path/filepath/path.go)
|
||||
// but has sorting removed (sorting doesn't make sense
|
||||
// in concurrent execution, anyway)
|
||||
|
||||
// readDirNames reads the directory named by dirname and returns
|
||||
// a list of directory entries.
|
||||
func readDirNames(dirname string) ([]string, error) {
|
||||
f, err := os.Open(dirname)
|
||||
if err != nil {
|
||||
return nil, err
|
||||
}
|
||||
defer func() {
|
||||
cerr := f.Close()
|
||||
if err == nil {
|
||||
err = cerr
|
||||
}
|
||||
}()
|
||||
|
||||
names, err := f.Readdirnames(-1)
|
||||
if err != nil {
|
||||
return nil, err
|
||||
}
|
||||
return names, nil
|
||||
}
|
||||
|
||||
// lstat is a wrapper for os.Lstat which accepts a path
|
||||
// relative to Walker.root and, when followSymlinks is set, resolves symlinks
|
||||
func (w *Walker) lstat(relpath string) (info os.FileInfo, err error) {
|
||||
path := filepath.Join(w.root, relpath)
|
||||
info, err = os.Lstat(path)
|
||||
if err != nil {
|
||||
return nil, err
|
||||
}
|
||||
// check if this is a symlink
|
||||
if w.followSymlinks {
|
||||
if info.Mode()&os.ModeSymlink > 0 {
|
||||
path, err = filepath.EvalSymlinks(path)
|
||||
if err != nil {
|
||||
return nil, err
|
||||
}
|
||||
info, err = os.Lstat(path)
|
||||
if err != nil {
|
||||
return nil, err
|
||||
}
|
||||
}
|
||||
}
|
||||
return
|
||||
}
|
||||
|
||||
// processPath processes one directory and adds
|
||||
// its subdirectories to the queue for further processing
|
||||
func (w *Walker) processPath(relpath string) error {
|
||||
defer w.wg.Done()
|
||||
|
||||
path := filepath.Join(w.root, relpath)
|
||||
names, err := readDirNames(path)
|
||||
if err != nil {
|
||||
log.Errorf("Walker - processPath - readDirNames - %s", err)
|
||||
return err
|
||||
}
|
||||
|
||||
for _, name := range names {
|
||||
subpath := filepath.Join(relpath, name)
|
||||
info, err := w.lstat(subpath)
|
||||
|
||||
if err != nil {
|
||||
log.Warnf("processPath - %s - %s", relpath, err)
|
||||
continue
|
||||
}
|
||||
|
||||
if info == nil {
|
||||
log.Warnf("processPath - %s - %s", relpath, err)
|
||||
continue
|
||||
}
|
||||
|
||||
w.walkFunc(filepath.Join(w.root, subpath), info, err)
|
||||
|
||||
//if err == filepath.SkipDir {
|
||||
// return nil
|
||||
//}
|
||||
|
||||
if info.Mode().IsDir() {
|
||||
w.addJob(subpath)
|
||||
}
|
||||
}
|
||||
return nil
|
||||
}
|
||||
|
||||
// addJob increments the job counter
|
||||
// and pushes the path to the jobs channel
|
||||
func (w *Walker) addJob(path string) {
|
||||
w.wg.Add(1)
|
||||
select {
|
||||
// try to push the job to the channel
|
||||
case w.jobs <- path: // ok
|
||||
default: // buffer overflow
|
||||
// process job synchronously
|
||||
err := w.processPath(path)
|
||||
if err != nil {
|
||||
log.Warnf("addJob - %s - %s", path, err)
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// worker processes all the jobs
|
||||
// until the jobs channel is explicitly closed
|
||||
func (w *Walker) worker() {
|
||||
for path := range w.jobs {
|
||||
WorkerPool <- struct{}{} // acquire a worker
|
||||
atomic.AddInt32(&ActiveWorkers, 1) // increment the number of active workers
|
||||
|
||||
err := w.processPath(path)
|
||||
if err != nil {
|
||||
log.Warnf("worker - %s", err)
|
||||
}
|
||||
|
||||
<-WorkerPool // release the worker when done
|
||||
atomic.AddInt32(&ActiveWorkers, -1) // decrement the number of active workers
|
||||
}
|
||||
}
|
||||
|
||||
// Walk recursively descends into subdirectories,
|
||||
// calling walkFn for each file or directory
|
||||
// in the tree, including the root directory.
|
||||
func (w *Walker) Walk(relpath string, walkFn filepath.WalkFunc) error {
|
||||
atomic.AddInt32(&ActiveWalks, 1) // increment the number of active Walk crawls
|
||||
defer atomic.AddInt32(&ActiveWalks, -1) // decrement the number of active Walk crawls when done
|
||||
|
||||
w.jobs = make(chan string, JobQueueSize)
|
||||
w.walkFunc = walkFn
|
||||
|
||||
info, err := w.lstat(relpath)
|
||||
err = w.walkFunc(filepath.Join(w.root, relpath), info, err)
|
||||
if err == filepath.SkipDir {
|
||||
return nil
|
||||
}
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
|
||||
if info == nil {
|
||||
return fmt.Errorf("broken symlink: %s", relpath)
|
||||
}
|
||||
|
||||
if !info.Mode().IsDir() {
|
||||
return ErrNotDir
|
||||
}
|
||||
|
||||
// spawn workers
|
||||
for n := 1; n <= JobQueueSize; n++ {
|
||||
go w.worker()
|
||||
}
|
||||
|
||||
w.addJob(relpath) // add this path as a first job
|
||||
w.wg.Wait() // wait till all paths are processed
|
||||
close(w.jobs) // signal workers to close
|
||||
return nil
|
||||
}
|
||||
|
||||
// Walk is a wrapper function for the Walker object
|
||||
// that mimics the behavior of filepath.Walk,
|
||||
// and follows symlinks only when followSymlinks is true.
|
||||
func Walk(root string, followSymlinks bool, walkFn filepath.WalkFunc) error {
|
||||
w := Walker{
|
||||
root: root,
|
||||
followSymlinks: followSymlinks,
|
||||
}
|
||||
return w.Walk("", walkFn)
|
||||
}
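
`WorkerPool` above is a buffered channel used as a semaphore: sending acquires a slot and receiving releases it. A minimal, self-contained sketch of that pattern follows; `runLimited` is a hypothetical name, and the peak counter exists only to make the limit observable in the example.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// runLimited (hypothetical) runs every job in its own goroutine but lets
// at most limit of them execute at once. It returns the highest number
// of jobs that were observed running concurrently.
func runLimited(limit int, jobs []func()) int32 {
	sem := make(chan struct{}, limit) // buffered channel acting as a semaphore
	var wg sync.WaitGroup
	var active, peak int32

	for _, job := range jobs {
		wg.Add(1)
		go func(job func()) {
			defer wg.Done()
			sem <- struct{}{}        // acquire a slot; blocks while limit jobs are running
			defer func() { <-sem }() // release the slot when the job finishes

			n := atomic.AddInt32(&active, 1)
			// Record the new peak if this goroutine raised it.
			for {
				p := atomic.LoadInt32(&peak)
				if n <= p || atomic.CompareAndSwapInt32(&peak, p, n) {
					break
				}
			}
			job()
			atomic.AddInt32(&active, -1)
		}(job)
	}

	wg.Wait()
	return atomic.LoadInt32(&peak)
}

func main() {
	jobs := make([]func(), 20)
	for i := range jobs {
		jobs[i] = func() {}
	}
	peak := runLimited(3, jobs)
	fmt.Println(peak >= 1 && peak <= 3) // prints "true"
}
```

The active counter is incremented only after a slot is acquired and decremented before it is released, so it can never exceed the channel capacity.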
|
|
@ -0,0 +1,134 @@
|
|||
package DirectoryCrawler
|
||||
|
||||
import (
|
||||
"crazyfs/CacheItem"
|
||||
"crazyfs/config"
|
||||
"crazyfs/file"
|
||||
"os"
|
||||
"path/filepath"
|
||||
)
|
||||
|
||||
func (dc *DirectoryCrawler) walkRecursiveFunc(path string, info os.FileInfo, err error) error {
|
||||
dc.processPath(path, info)
|
||||
return nil
|
||||
}
|
||||
|
||||
func (dc *DirectoryCrawler) walkNonRecursiveFunc(path string, dir os.DirEntry, err error) error {
|
||||
info, err := dir.Info()
|
||||
if err != nil {
|
||||
log.Errorf("CRAWLER - walkNonRecursiveFunc() - get info failed - %s", err)
|
||||
return err
|
||||
}
|
||||
dc.processPath(path, info)
|
||||
return nil
|
||||
}
|
||||
|
||||
func (dc *DirectoryCrawler) Crawl(fullPath string, shouldBlock bool) error {
|
||||
info, err := os.Lstat(fullPath)
|
||||
if os.IsNotExist(err) {
|
||||
// If the path doesn't exist, return the error to the caller without logging it
|
||||
return err
|
||||
}
|
||||
if err != nil {
|
||||
log.Errorf("CRAWLER - Crawl() - os.Lstat() failed - %s", err)
|
||||
return err
|
||||
}
|
||||
//if !config.FollowSymlinks && info.Mode()&os.ModeSymlink > 0 {
|
||||
// msg := fmt.Sprintf("CRAWL - tried to crawl a symlink (not allowed in config): %s", fullPath)
|
||||
// log.Warnf(msg)
|
||||
// return errors.New(msg)
|
||||
//}
|
||||
relPath := file.StripRootDir(fullPath)
|
||||
|
||||
dc.cache.Remove(relPath)
|
||||
|
||||
if info.IsDir() {
|
||||
// Get a list of all keys in the cache that belong to this directory
|
||||
keys := make([]string, 0)
|
||||
for _, key := range dc.cache.Keys() {
|
||||
if isSubpath(fullPath, key) {
|
||||
keys = append(keys, key)
|
||||
}
|
||||
}
|
||||
|
||||
// Remove all entries in the cache that belong to this directory so we can start fresh
|
||||
for _, key := range keys {
|
||||
dc.cache.Remove(key)
|
||||
}
|
||||
|
||||
// If the path is a directory, start a walk
|
||||
err := Walk(fullPath, config.FollowSymlinks, dc.walkRecursiveFunc)
|
||||
if err != nil {
|
||||
log.Errorf("CRAWLER - crawl for %s failed: %s", fullPath, err)
|
||||
}
|
||||
|
||||
// After crawling, remove any keys that are still in the list (these are items that were not found on the filesystem)
|
||||
//dc.CleanupDeletedFiles(path)
|
||||
} else {
|
||||
// If the path is a file, add it to the cache directly
|
||||
dc.AddCacheItem(fullPath, info) // AddCacheItem expects the full path and strips the root dir itself
|
||||
}
|
||||
return nil
|
||||
}
|
||||
|
||||
// CrawlNoRecursion crawls a file or directory without recursing into subdirectories and returns the resulting cache item.
|
||||
func (dc *DirectoryCrawler) CrawlNoRecursion(fullPath string) (*CacheItem.Item, error) {
|
||||
info, err := os.Lstat(fullPath)
|
||||
if os.IsNotExist(err) {
|
||||
// If the path doesn't exist, return the error to the caller without logging it
|
||||
return nil, err
|
||||
}
|
||||
if err != nil {
|
||||
log.Errorf("CRAWLER - CrawlNoRecursion() - os.Lstat() failed - %s", err)
|
||||
return nil, err
|
||||
}
|
||||
//if !config.FollowSymlinks && info.Mode()&os.ModeSymlink > 0 {
|
||||
// msg := fmt.Sprintf("CRAWL - tried to crawl a symlink (not allowed in config): %s", fullPath)
|
||||
// log.Warnf(msg)
|
||||
// return nil, errors.New(msg)
|
||||
//}
|
||||
|
||||
var item *CacheItem.Item
|
||||
relPath := file.StripRootDir(fullPath)
|
||||
|
||||
dc.cache.Remove(relPath)
|
||||
|
||||
if info.IsDir() {
|
||||
// Get a list of all keys in the cache that belong to this directory
|
||||
keys := make([]string, 0)
|
||||
for _, key := range dc.cache.Keys() {
|
||||
if isSubpath(fullPath, key) {
|
||||
keys = append(keys, key)
|
||||
}
|
||||
}
|
||||
|
||||
// Remove all entries in the cache that belong to this directory so we can start fresh
|
||||
for _, key := range keys {
|
||||
dc.cache.Remove(key)
|
||||
}
|
||||
|
||||
err := filepath.WalkDir(fullPath, dc.walkNonRecursiveFunc)
|
||||
if err != nil {
|
||||
log.Errorf("CRAWLER - non-recursive crawl for %s failed: %s", fullPath, err)
|
||||
return nil, err
|
||||
}
|
||||
item, _ = dc.cache.Get(relPath)
|
||||
} else {
|
||||
item = CacheItem.NewItem(fullPath, info)
|
||||
dc.AddCacheItem(fullPath, info)
|
||||
}
|
||||
return item, nil
|
||||
}
|
||||
|
||||
func removeOldDir(children []string, strippedDir string) ([]string, bool) {
|
||||
newChildren := make([]string, 0, len(children))
|
||||
foundOldDir := false
|
||||
for _, child := range children {
|
||||
if child != strippedDir {
|
||||
newChildren = append(newChildren, child)
|
||||
} else {
|
||||
foundOldDir = true
|
||||
}
|
||||
}
|
||||
return newChildren, foundOldDir
|
||||
}
|
|
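A minimal, self-contained sketch of how `removeOldDir` behaves may help when reading `processPath` below. The helper body is copied from the diff above; the sample paths and the `main` function are hypothetical and only serve as illustration.

```go
package main

import "fmt"

// removeOldDir returns a copy of children without strippedDir and reports
// whether strippedDir was present in the original slice.
func removeOldDir(children []string, strippedDir string) ([]string, bool) {
	newChildren := make([]string, 0, len(children))
	foundOldDir := false
	for _, child := range children {
		if child != strippedDir {
			newChildren = append(newChildren, child)
		} else {
			foundOldDir = true
		}
	}
	return newChildren, foundOldDir
}

func main() {
	// Hypothetical relative paths, as they might appear in a parent's Children field.
	children := []string{"/a", "/b", "/c"}
	kept, found := removeOldDir(children, "/b")
	fmt.Println(kept, found) // prints "[/a /c] true"
}
```

Note that the input slice is left untouched; callers decide what to do with the filtered copy and the `found` flag.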
@ -0,0 +1,56 @@
|
|||
package DirectoryCrawler
|
||||
|
||||
import (
|
||||
"crazyfs/CacheItem"
|
||||
"crazyfs/config"
|
||||
"crazyfs/file"
|
||||
"os"
|
||||
"path/filepath"
|
||||
)
|
||||
|
||||
func (dc *DirectoryCrawler) processPath(fullPath string, info os.FileInfo) error {
|
||||
relPath := file.StripRootDir(fullPath)
|
||||
|
||||
dc.visited.Store(relPath, true)
|
||||
|
||||
if info.Mode().IsDir() {
|
||||
dirItem := CacheItem.NewItem(fullPath, info)
|
||||
|
||||
children, err := os.ReadDir(fullPath)
|
||||
if err != nil {
|
||||
log.Errorf("CRAWLER - processPath() failed to read directory %s: %s", fullPath, err)
|
||||
}
|
||||
|
||||
for _, entry := range children {
|
||||
subpath := filepath.Clean(filepath.Join(fullPath, entry.Name()))
|
||||
dirItem.Children = append(dirItem.Children, file.StripRootDir(subpath))
|
||||
}
|
||||
|
||||
// Add the directory to the cache after all of its children have been processed
|
||||
dc.cache.Add(relPath, dirItem)
|
||||
|
||||
// If the directory is not the root directory, update the parent directory's Children field
|
||||
// This block of code ensures that the parent directory's Children field is always up-to-date with
|
||||
// the current state of its subdirectories. It removes any old versions of the current directory
|
||||
// from the parent's Children field and adds the new version.
|
||||
if fullPath != config.RootDir {
|
||||
parentDir := filepath.Dir(fullPath)
|
||||
strippedParentDir := file.StripRootDir(parentDir)
|
||||
parentItem, found := dc.cache.Get(strippedParentDir)
|
||||
if found {
|
||||
// Remove the old version of the directory from the parent's Children field
|
||||
newChildren, foundOldDir := removeOldDir(parentItem.Children, relPath)
|
||||
// Add the new version of the directory to the parent's Children field only if it wasn't found
|
||||
if !foundOldDir {
|
||||
parentItem.Children = append(newChildren, relPath)
|
||||
}
|
||||
// Update the parent directory in the cache
|
||||
dc.cache.Add(strippedParentDir, parentItem)
|
||||
}
|
||||
}
|
||||
} else {
|
||||
// Path is a file
|
||||
dc.AddCacheItem(fullPath, info)
|
||||
}
|
||||
return nil
|
||||
}
|
|
@ -0,0 +1,12 @@
|
|||
package DirectoryCrawler
|
||||
|
||||
import (
|
||||
"crazyfs/logging"
|
||||
"github.com/sirupsen/logrus"
|
||||
)
|
||||
|
||||
var log *logrus.Logger
|
||||
|
||||
func init() {
|
||||
log = logging.GetLogger()
|
||||
}
|
|
@ -1,46 +0,0 @@
|
|||
package cache
|
||||
|
||||
import (
|
||||
"crazyfs/logging"
|
||||
"os"
|
||||
)
|
||||
|
||||
type Worker struct {
|
||||
id int
|
||||
ch chan string
|
||||
active bool
|
||||
}
|
||||
|
||||
func newWorker(id int) *Worker {
|
||||
return &Worker{
|
||||
id: id,
|
||||
ch: make(chan string, WorkerBufferSize),
|
||||
active: false,
|
||||
}
|
||||
}
|
||||
|
||||
func (w *Worker) start(dc *DirectoryCrawler) {
|
||||
w.active = true
|
||||
go func() {
|
||||
for path := range w.ch {
|
||||
info, err := os.Stat(path)
|
||||
if err != nil {
|
||||
logger := logging.GetLogger()
|
||||
logger.Errorf("WORKER START - os.Stat() - %s", err)
|
||||
continue
|
||||
}
|
||||
dc.cache.Add(StripRootDir(path, RootDir), NewItem(path, info))
|
||||
}
|
||||
w.active = false
|
||||
// Release the token back to the semaphore when the worker is done
|
||||
<-WorkerSemaphore
|
||||
}()
|
||||
}
|
||||
|
||||
func (w *Worker) add(path string) {
|
||||
w.ch <- path
|
||||
}
|
||||
|
||||
func (w *Worker) stop() {
|
||||
close(w.ch)
|
||||
}
|
|
@ -1,44 +0,0 @@
|
|||
package cache
|
||||
|
||||
import "sync"
|
||||
|
||||
var WorkerSemaphore chan struct{}
|
||||
|
||||
type WorkerPool struct {
|
||||
pool chan *Worker
|
||||
wg sync.WaitGroup
|
||||
}
|
||||
|
||||
func NewWorkerPool() *WorkerPool {
|
||||
return &WorkerPool{
|
||||
pool: make(chan *Worker, cap(WorkerSemaphore)), // use the capacity of the semaphore as the size of the pool
|
||||
}
|
||||
}
|
||||
|
||||
func (p *WorkerPool) Get() *Worker {
|
||||
select {
|
||||
case w := <-p.pool:
|
||||
return w
|
||||
default:
|
||||
// Acquire a token from the semaphore
|
||||
WorkerSemaphore <- struct{}{}
|
||||
return newWorker(len(p.pool))
|
||||
}
|
||||
}
|
||||
|
||||
func (p *WorkerPool) Put(w *Worker) {
|
||||
select {
|
||||
case p.pool <- w:
|
||||
default:
|
||||
// If the pool is full, discard the worker and release the token back to the semaphore
|
||||
<-WorkerSemaphore
|
||||
}
|
||||
}
|
||||
|
||||
func (p *WorkerPool) Wait() {
|
||||
p.wg.Wait()
|
||||
}
|
||||
|
||||
func (p *WorkerPool) Add(delta int) {
|
||||
p.wg.Add(delta)
|
||||
}
|
|
@ -1,123 +1,67 @@
|
|||
package cache
|
||||
|
||||
import (
|
||||
"crazyfs/CacheItem"
|
||||
"crazyfs/cache/DirectoryCrawler"
|
||||
"crazyfs/config"
|
||||
"crazyfs/data"
|
||||
"crazyfs/logging"
|
||||
lru "github.com/hashicorp/golang-lru/v2"
|
||||
"os"
|
||||
"github.com/sirupsen/logrus"
|
||||
"sync"
|
||||
"time"
|
||||
)
|
||||
|
||||
var itemPool = &sync.Pool{
|
||||
New: func() interface{} {
|
||||
return &data.Item{}
|
||||
},
|
||||
var log *logrus.Logger
|
||||
|
||||
func init() {
|
||||
log = logging.GetLogger()
|
||||
}
|
||||
|
||||
func StartCrawler(basePath string, sharedCache *lru.Cache[string, *data.Item], cfg *config.Config) error {
|
||||
log = logging.GetLogger()
|
||||
func StartCrawler(sharedCache *lru.Cache[string, *CacheItem.Item], cfg *config.Config) error {
|
||||
var wg sync.WaitGroup
|
||||
crawlerChan := make(chan struct{}, cfg.DirectoryCrawlers)
|
||||
|
||||
// TODO: a crawl may take some time to complete so we need to adjust the wait time based on the duration it took
|
||||
go func() {
|
||||
ticker := time.NewTicker(time.Second * time.Duration(cfg.CrawlModeCrawlInterval))
|
||||
defer ticker.Stop()
|
||||
|
||||
// delay before first crawl
|
||||
time.Sleep(time.Second * time.Duration(cfg.CrawlModeCrawlInterval))
|
||||
|
||||
for {
|
||||
select {
|
||||
case <-ticker.C:
|
||||
crawlerChan <- struct{}{} // block if there are already cfg.DirectoryCrawlers crawlers
|
||||
wg.Add(1)
|
||||
go func() {
|
||||
defer wg.Done()
|
||||
|
||||
pool := NewWorkerPool()
|
||||
crawler := NewDirectoryCrawler(sharedCache, pool)
|
||||
log.Infoln("CRAWLER - Starting a crawl...")
|
||||
start := time.Now()
|
||||
err := crawler.Crawl(basePath, true)
|
||||
duration := time.Since(start).Round(time.Second)
|
||||
if err != nil {
|
||||
log.Warnf("CRAWLER - Crawl failed: %s", err)
|
||||
} else {
|
||||
log.Infof("CRAWLER - Crawl completed in %s", duration)
|
||||
keys := sharedCache.Keys()
|
||||
log.Debugf("%d/%d items in the cache.", cfg.CacheSize, len(keys))
|
||||
}
|
||||
<-crawlerChan // release
|
||||
}()
|
||||
}
|
||||
}
|
||||
}()
|
||||
go startCrawl(cfg, sharedCache, &wg, crawlerChan)
|
||||
|
||||
ticker := time.NewTicker(60 * time.Second)
|
||||
go func(c *lru.Cache[string, *data.Item]) {
|
||||
for range ticker.C {
|
||||
keys := c.Keys()
|
||||
log.Debugf("%d/%d items in the cache.", len(keys), cfg.CacheSize)
|
||||
//fmt.Println(keys) // for debug when things are really messed up
|
||||
}
|
||||
}(sharedCache)
|
||||
go logCacheStatus("CACHE STATUS", ticker, sharedCache, cfg, log.Debugf)
|
||||
|
||||
return nil
|
||||
}
|
||||
|
||||
func NewItem(path string, info os.FileInfo) *data.Item {
|
||||
if PrintNew {
|
||||
log = logging.GetLogger()
|
||||
log.Debugf("CACHE - new: %s", path)
|
||||
func startCrawl(cfg *config.Config, sharedCache *lru.Cache[string, *CacheItem.Item], wg *sync.WaitGroup, crawlerChan chan struct{}) {
|
||||
ticker := time.NewTicker(time.Duration(cfg.CrawlModeCrawlInterval) * time.Second)
|
||||
defer ticker.Stop()
|
||||
|
||||
time.Sleep(time.Duration(cfg.CrawlModeCrawlInterval) * time.Second)
|
||||
|
||||
for range ticker.C {
|
||||
crawlerChan <- struct{}{}
|
||||
wg.Add(1)
|
||||
go func() {
|
||||
defer wg.Done()
|
||||
dc := DirectoryCrawler.NewDirectoryCrawler(sharedCache)
|
||||
log.Infoln("CRAWLER - Starting a crawl...")
|
||||
start := time.Now()
|
||||
err := dc.Crawl(cfg.RootDir, true)
|
||||
duration := time.Since(start).Round(time.Second)
|
||||
if err != nil {
|
||||
log.Warnf("CRAWLER - Crawl failed: %s", err)
|
||||
} else {
|
||||
log.Infof("CRAWLER - Crawl completed in %s", duration)
|
||||
log.Debugf("%d/%d items in the cache.", len(sharedCache.Keys()), cfg.CacheSize)
|
||||
}
|
||||
<-crawlerChan
|
||||
}()
|
||||
}
|
||||
}
|
||||
|
||||
func logCacheStatus(msg string, ticker *time.Ticker, sharedCache *lru.Cache[string, *CacheItem.Item], cfg *config.Config, logFn func(format string, args ...interface{})) {
|
||||
defer ticker.Stop()
|
||||
for range ticker.C {
|
||||
activeWorkers := int(DirectoryCrawler.ActiveWorkers)
|
||||
busyWorkers := int(DirectoryCrawler.ActiveWalks)
|
||||
logFn("%s - %d/%d items in the cache. Active workers: %d Active crawls: %d", msg, len(sharedCache.Keys()), cfg.CacheSize, activeWorkers, busyWorkers)
|
||||
//fmt.Println(sharedCache.Keys())
|
||||
}
|
||||
|
||||
// Start processing the MIME type right away.
|
||||
// It will run in the background while we set up the Item object.
|
||||
ch := make(chan [2]string)
|
||||
go AnalyzeFileMime(path, info, CrawlerParseMIME, ch)
|
||||
|
||||
item := itemPool.Get().(*data.Item)
|
||||
|
||||
// Reset fields
|
||||
item.Path = ""
|
||||
item.Name = ""
|
||||
item.Size = 0
|
||||
item.Extension = nil
|
||||
item.Modified = ""
|
||||
item.Mode = 0
|
||||
item.IsDir = false
|
||||
item.IsSymlink = false
|
||||
item.Cached = 0
|
||||
item.Children = item.Children[:0]
|
||||
item.Type = nil
|
||||
|
||||
// Set fields
|
||||
item.Path = StripRootDir(path, RootDir)
|
||||
item.Name = info.Name()
|
||||
item.Size = info.Size()
|
||||
item.Modified = info.ModTime().UTC().Format(time.RFC3339Nano)
|
||||
item.Mode = uint32(info.Mode().Perm())
|
||||
item.IsDir = info.IsDir()
|
||||
item.IsSymlink = info.Mode()&os.ModeSymlink != 0
|
||||
item.Cached = time.Now().UnixNano() / int64(time.Millisecond)
|
||||
|
||||
// Get the MIME data from the background thread
|
||||
mimeResult := <-ch // This will block until the goroutine finishes
|
||||
ext, mimeType := mimeResult[0], mimeResult[1]
|
||||
|
||||
// Create pointers for mimeType and ext to allow empty JSON strings
|
||||
var mimeTypePtr, extPtr *string
|
||||
if mimeType != "" {
|
||||
mimeTypePtr = &mimeType
|
||||
}
|
||||
if ext != "" {
|
||||
extPtr = &ext
|
||||
}
|
||||
item.Extension = extPtr
|
||||
item.Type = mimeTypePtr
|
||||
|
||||
return item
|
||||
}
|
||||
|
|
|
@ -1,69 +0,0 @@
|
|||
package cache
|
||||
|
||||
import (
|
||||
"github.com/gabriel-vasile/mimetype"
|
||||
"mime"
|
||||
"os"
|
||||
"path/filepath"
|
||||
"strings"
|
||||
)
|
||||
|
||||
func StripRootDir(path, RootDir string) string {
|
||||
if path == "/" || path == RootDir {
|
||||
// Avoid erasing our path
|
||||
return "/"
|
||||
} else {
|
||||
return strings.TrimSuffix(strings.TrimPrefix(path, RootDir), "/")
|
||||
}
|
||||
}
|
||||
|
||||
func GetFileMime(path string, analyze bool) (bool, string, string, error) {
|
||||
var err error
|
||||
info, err := os.Stat(path)
|
||||
if err != nil {
|
||||
// File does not exist
|
||||
return false, "", "", err
|
||||
}
|
||||
ch := make(chan [2]string)
|
||||
go AnalyzeFileMime(path, info, analyze, ch)
|
||||
|
||||
// Get the MIME data from the background thread
|
||||
mimeResult := <-ch // This will block until the goroutine finishes
|
||||
ext, mimeType := mimeResult[0], mimeResult[1]
|
||||
return true, mimeType, ext, nil
|
||||
}
|
||||
|
||||
func detectMIME(path string, info os.FileInfo) string {
|
||||
if info.Mode()&os.ModeType == 0 {
|
||||
mimeObj, err := mimetype.DetectFile(path)
|
||||
if err != nil {
|
||||
log.Warnf("Error detecting MIME type: %v", err)
|
||||
return ""
|
||||
} else {
|
||||
return mimeObj.String()
|
||||
}
|
||||
} else {
|
||||
return ""
|
||||
}
|
||||
}
|
||||
|
||||
func AnalyzeFileMime(path string, info os.FileInfo, analyze bool, ch chan<- [2]string) {
|
||||
go func() {
|
||||
var ext string
|
||||
var mimeType string
|
||||
if !info.IsDir() && !(info.Mode()&os.ModeSymlink == os.ModeSymlink) {
|
||||
if CrawlerParseMIME || analyze {
|
||||
ext = filepath.Ext(path)
|
||||
mimeType = detectMIME(path, info)
|
||||
} else {
|
||||
mimeType = mime.TypeByExtension(ext)
|
||||
}
|
||||
if strings.Contains(mimeType, ";") {
|
||||
mimeType = strings.Split(mimeType, ";")[0]
|
||||
}
|
||||
ch <- [2]string{ext, mimeType}
|
||||
} else {
|
||||
ch <- [2]string{"", ""}
|
||||
}
|
||||
}()
|
||||
}
|
|
@ -1,12 +1,11 @@
|
|||
package cache
|
||||
|
||||
import (
|
||||
"crazyfs/CacheItem"
|
||||
"crazyfs/cache/DirectoryCrawler"
|
||||
"crazyfs/config"
|
||||
"crazyfs/data"
|
||||
"crazyfs/logging"
|
||||
lru "github.com/hashicorp/golang-lru/v2"
|
||||
"runtime"
|
||||
"sync"
|
||||
"time"
|
||||
)
|
||||
|
||||
|
@ -16,62 +15,22 @@ func init() {
|
|||
InitialCrawlInProgress = false
|
||||
}
|
||||
|
||||
func InitialCrawl(sharedCache *lru.Cache[string, *data.Item], cfg *config.Config) {
|
||||
func InitialCrawl(sharedCache *lru.Cache[string, *CacheItem.Item], cfg *config.Config) {
|
||||
log = logging.GetLogger()
|
||||
|
||||
log.Infof("INITIAL CRAWL - starting the crawl for %s", config.RootDir)
|
||||
|
||||
ticker := time.NewTicker(3 * time.Second)
|
||||
go logCacheStatus("INITIAL CRAWL", ticker, sharedCache, cfg, log.Infof)
|
||||
|
||||
InitialCrawlInProgress = true
|
||||
dirChan := make(chan string, 100000) // Buffered channel to hold directories to be crawled
|
||||
var wg sync.WaitGroup
|
||||
cacheFull := make(chan bool, 1) // Channel to signal when cache is full
|
||||
|
||||
// Start Worker goroutines
|
||||
for i := 0; i < runtime.NumCPU()*6; i++ {
|
||||
wg.Add(1)
|
||||
go func() {
|
||||
defer wg.Done()
|
||||
for {
|
||||
select {
|
||||
case dir, ok := <-dirChan:
|
||||
if ok {
|
||||
crawlDir(dir, sharedCache, cacheFull, cfg)
|
||||
} else {
|
||||
return
|
||||
}
|
||||
case <-cacheFull:
|
||||
return
|
||||
}
|
||||
}
|
||||
}()
|
||||
}
|
||||
|
||||
// Kick off the crawl
|
||||
dirChan <- cfg.RootDir
|
||||
close(dirChan)
|
||||
|
||||
// Start a ticker to log the number of items in the cache every 2 seconds
|
||||
ticker := time.NewTicker(2 * time.Second)
|
||||
go func() {
|
||||
for range ticker.C {
|
||||
log.Debugf("INITIAL CRAWL - cache size: %d/%d", sharedCache.Len(), cfg.CacheSize)
|
||||
}
|
||||
}()
|
||||
|
||||
// Wait for all goroutines to finish
|
||||
wg.Wait()
|
||||
ticker.Stop()
|
||||
InitialCrawlInProgress = false
|
||||
}
|
||||
|
||||
func crawlDir(dir string, sharedCache *lru.Cache[string, *data.Item], cacheFull chan<- bool, cfg *config.Config) {
|
||||
pool := NewWorkerPool()
|
||||
crawler := NewDirectoryCrawler(sharedCache, pool)
|
||||
err := crawler.Crawl(dir, true)
|
||||
dc := DirectoryCrawler.NewDirectoryCrawler(sharedCache)
|
||||
//start := time.Now()
|
||||
err := dc.Crawl(config.RootDir, true)
|
||||
if err != nil {
|
||||
log.Fatalf("Crawl failed: %s", err)
|
||||
return
|
||||
}
|
||||
|
||||
// Check if cache is full
|
||||
if sharedCache.Len() >= cfg.CacheSize {
|
||||
cacheFull <- true
|
||||
log.Errorf("LIST - background recursive crawl failed: %s", err)
|
||||
}
|
||||
InitialCrawlInProgress = false
|
||||
ticker.Stop()
|
||||
//log.Infof("INITIAL CRAWL - finished the initial crawl in %s", time.Since(start).Round(time.Second))
|
||||
}
|
||||
|
|
|
@ -1,8 +1,10 @@
|
|||
package cache
|
||||
|
||||
import (
|
||||
"crazyfs/CacheItem"
|
||||
"crazyfs/cache/DirectoryCrawler"
|
||||
"crazyfs/config"
|
||||
"crazyfs/data"
|
||||
"crazyfs/file"
|
||||
"crazyfs/logging"
|
||||
lru "github.com/hashicorp/golang-lru/v2"
|
||||
"os"
|
||||
|
@ -16,7 +18,7 @@ func InitRecacheSemaphore(limit int) {
|
|||
sem = make(chan struct{}, limit)
|
||||
}
|
||||
|
||||
func CheckAndRecache(path string, cfg *config.Config, sharedCache *lru.Cache[string, *data.Item]) {
|
||||
func CheckAndRecache(path string, cfg *config.Config, sharedCache *lru.Cache[string, *CacheItem.Item]) {
|
||||
item, found := sharedCache.Get(path)
|
||||
if found && time.Now().UnixNano()/int64(time.Millisecond)-item.Cached > int64(cfg.CacheTime)*60*1000 {
|
||||
log := logging.GetLogger()
|
||||
|
@ -24,9 +26,8 @@ func CheckAndRecache(path string, cfg *config.Config, sharedCache *lru.Cache[str
|
|||
sem <- struct{}{} // acquire a token
|
||||
go func() {
|
||||
defer func() { <-sem }() // release the token when done
|
||||
pool := NewWorkerPool()
|
||||
crawler := NewDirectoryCrawler(sharedCache, pool)
|
||||
err := crawler.Crawl(path, true)
|
||||
dc := DirectoryCrawler.NewDirectoryCrawler(sharedCache)
|
||||
err := dc.Crawl(path, true)
|
||||
if err != nil {
|
||||
log.Errorf("RECACHE ERROR: %s", err.Error())
|
||||
}
|
||||
|
@ -34,26 +35,27 @@ func CheckAndRecache(path string, cfg *config.Config, sharedCache *lru.Cache[str
|
|||
}
|
||||
}
|
||||
|
||||
func Recache(path string, cfg *config.Config, sharedCache *lru.Cache[string, *data.Item]) {
|
||||
func Recache(path string, sharedCache *lru.Cache[string, *CacheItem.Item]) {
|
||||
log := logging.GetLogger()
|
||||
log.Debugf("Re-caching: %s", path)
|
||||
start := time.Now()
|
||||
sem <- struct{}{} // acquire a token
|
||||
go func() {
|
||||
defer func() { <-sem }() // release the token when done
|
||||
pool := NewWorkerPool()
|
||||
crawler := NewDirectoryCrawler(sharedCache, pool)
|
||||
err := crawler.Crawl(path, true)
|
||||
dc := DirectoryCrawler.NewDirectoryCrawler(sharedCache)
|
||||
err := dc.Crawl(path, true)
|
||||
if err != nil {
|
||||
log.Errorf("RECACHE ERROR: %s", err.Error())
|
||||
}
|
||||
|
||||
// Get the parent directory from the cache
|
||||
parentDir := filepath.Dir(path)
|
||||
parentItem, found := sharedCache.Get(parentDir)
|
||||
parentDirRel := file.StripRootDir(parentDir)
|
||||
parentItem, found := sharedCache.Get(parentDirRel)
|
||||
if found {
|
||||
// Remove the old sub-directory from the parent directory's Children field
|
||||
for i, child := range parentItem.Children {
|
||||
if child.Path == path {
|
||||
if child == path {
|
||||
parentItem.Children = append(parentItem.Children[:i], parentItem.Children[i+1:]...)
|
||||
break
|
||||
}
|
||||
|
@ -64,16 +66,16 @@ func Recache(path string, cfg *config.Config, sharedCache *lru.Cache[string, *da
|
|||
if err != nil {
|
||||
log.Errorf("RECACHE ERROR: %s", err.Error())
|
||||
} else {
|
||||
newItem := NewItem(path, info)
|
||||
newItem := CacheItem.NewItem(path, info)
|
||||
// Create a new slice that contains all items from the Children field except the old directory
|
||||
newChildren := make([]*data.Item, 0, len(parentItem.Children))
|
||||
newChildren := make([]string, 0, len(parentItem.Children))
|
||||
for _, child := range parentItem.Children {
|
||||
if child.Path != newItem.Path {
|
||||
if child != newItem.Path {
|
||||
newChildren = append(newChildren, child)
|
||||
}
|
||||
}
|
||||
// Append the new directory to the newChildren slice
|
||||
newChildren = append(newChildren, newItem)
|
||||
newChildren = append(newChildren, newItem.Path)
|
||||
// Assign the newChildren slice to the Children field
|
||||
parentItem.Children = newChildren
|
||||
// Update the parent directory in the cache
|
||||
|
@ -81,10 +83,13 @@ func Recache(path string, cfg *config.Config, sharedCache *lru.Cache[string, *da
|
|||
}
|
||||
} else {
|
||||
// If the parent directory isn't in the cache, crawl it
|
||||
err := crawler.Crawl(parentDir, true)
|
||||
log.Infof("RECACHE - crawling parent directory since it isn't in the cache yet: %s", parentDir)
|
||||
err := dc.Crawl(parentDir, true)
|
||||
if err != nil {
|
||||
log.Errorf("RECACHE ERROR: %s", err.Error())
|
||||
}
|
||||
}
|
||||
duration := time.Since(start).Round(time.Second)
|
||||
log.Infof("RECACHE - completed in %s - %s", duration, path)
|
||||
}()
|
||||
}
|
||||
|
|
|
@ -0,0 +1,98 @@
|
|||
package cache
|
||||
|
||||
import (
|
||||
"bytes"
|
||||
"crazyfs/CacheItem"
|
||||
"crazyfs/config"
|
||||
"encoding/gob"
|
||||
lru "github.com/hashicorp/golang-lru/v2"
|
||||
"strings"
|
||||
)
|
||||
|
||||
func SearchLRU(queryString string, excludeElements []string, limitResults int, sharedCache *lru.Cache[string, *CacheItem.Item], cfg *config.Config) []*CacheItem.Item {
|
||||
results := make([]*CacheItem.Item, 0)
|
||||
|
||||
const maxGoroutines = 100
|
||||
|
||||
// Create a buffered channel as a semaphore
|
||||
sem := make(chan struct{}, maxGoroutines)
|
||||
|
||||
resultsChan := make(chan *CacheItem.Item, len(sharedCache.Keys()))
|
||||
|
||||
for _, key := range sharedCache.Keys() {
|
||||
searchKey(key, queryString, excludeElements, sem, resultsChan, sharedCache, cfg)
|
||||
}
|
||||
|
||||
// Wait for all goroutines to finish
|
||||
for i := 0; i < maxGoroutines; i++ {
|
||||
sem <- struct{}{}
|
||||
}
|
||||
|
||||
for range sharedCache.Keys() {
|
||||
item := <-resultsChan
|
||||
if item != nil {
|
||||
results = append(results, item)
|
||||
if (limitResults > 0 && len(results) == limitResults) || len(results) >= cfg.ApiSearchMaxResults {
|
||||
break
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
return results
|
||||
}
|
||||
|
||||
func searchKey(key string, queryString string, excludeElements []string, sem chan struct{}, resultsChan chan *CacheItem.Item, sharedCache *lru.Cache[string, *CacheItem.Item], cfg *config.Config) {
|
||||
// Acquire a token
|
||||
sem <- struct{}{}
|
||||
|
||||
go func() {
|
||||
// Release the token at the end
|
||||
defer func() { <-sem }()
|
||||
|
||||
cacheItem, found := sharedCache.Get(key)
|
||||
if !found {
|
||||
resultsChan <- nil
|
||||
return
|
||||
}
|
||||
lowerKey := strings.ToLower(key)
|
||||
|
||||
if strings.Contains(lowerKey, queryString) {
|
||||
// check if key contains any of the exclude elements
|
||||
shouldExclude := false
|
||||
for _, exclude := range excludeElements {
|
||||
if strings.Contains(lowerKey, exclude) {
|
||||
shouldExclude = true
|
||||
break
|
||||
}
|
||||
}
|
||||
if shouldExclude {
|
||||
resultsChan <- nil
|
||||
return
|
||||
}
|
||||
|
||||
// Create a deep copy of the CacheItem
|
||||
var buf bytes.Buffer
|
||||
enc := gob.NewEncoder(&buf)
|
||||
dec := gob.NewDecoder(&buf)
|
||||
err := enc.Encode(cacheItem)
|
||||
if err != nil {
|
||||
log.Printf("Error encoding CacheItem: %v", err)
|
||||
resultsChan <- nil
|
||||
return
|
||||
}
|
||||
var item CacheItem.Item
|
||||
err = dec.Decode(&item)
|
||||
if err != nil {
|
||||
log.Printf("Error decoding CacheItem: %v", err)
|
||||
resultsChan <- nil
|
||||
return
|
||||
}
|
||||
if !cfg.ApiSearchShowChildren {
|
||||
item.Children = nil // drop the list of child paths from the search result
|
||||
}
|
||||
resultsChan <- &item
|
||||
} else {
|
||||
resultsChan <- nil // no match: still send one value per key so the receive loop in SearchLRU cannot block
|
||||
}
|
||||
}()
|
||||
}
|
|
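The bounded fan-out used by `SearchLRU` and `searchKey` can also be written with a `sync.WaitGroup` instead of refilling the semaphore to wait for completion. The following is only a sketch under stated assumptions, not the project's implementation: `searchKeys` is a hypothetical stand-in that works on a plain slice of keys rather than the LRU cache, it expects an already lower-cased query, and it requires `maxGoroutines` to be at least 1.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"sync"
)

// searchKeys (hypothetical) returns the keys that contain query, checking at
// most maxGoroutines keys concurrently.
func searchKeys(keys []string, query string, maxGoroutines int) []string {
	sem := make(chan struct{}, maxGoroutines)   // counting semaphore
	resultsChan := make(chan string, len(keys)) // large enough that sends never block
	var wg sync.WaitGroup

	for _, key := range keys {
		sem <- struct{}{} // acquire a token; blocks while maxGoroutines are busy
		wg.Add(1)
		go func(k string) {
			defer wg.Done()
			defer func() { <-sem }() // release the token
			if strings.Contains(strings.ToLower(k), query) {
				resultsChan <- k
			}
		}(key)
	}

	wg.Wait()          // every goroutine has finished
	close(resultsChan) // lets the range below terminate

	results := make([]string, 0)
	for k := range resultsChan {
		results = append(results, k)
	}
	sort.Strings(results) // goroutine completion order is not deterministic
	return results
}

func main() {
	keys := []string{"/docs/README.md", "/img/cat.png", "/docs/notes.txt"}
	fmt.Println(searchKeys(keys, "docs", 2)) // prints "[/docs/README.md /docs/notes.txt]"
}
```

Because the channel is closed after `wg.Wait()`, the receiver does not need to know how many goroutines produced a result, so non-matching keys do not have to send a placeholder value.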
@ -1,21 +1,17 @@
|
|||
package cache
|
||||
|
||||
import (
|
||||
"crazyfs/CacheItem"
|
||||
"crazyfs/cache/DirectoryCrawler"
|
||||
"crazyfs/config"
|
||||
"crazyfs/data"
|
||||
"crazyfs/logging"
|
||||
lru "github.com/hashicorp/golang-lru/v2"
|
||||
"github.com/radovskyb/watcher"
|
||||
"github.com/sirupsen/logrus"
|
||||
"strings"
|
||||
"sync"
|
||||
"time"
|
||||
)
|
||||
|
||||
var log *logrus.Logger
|
||||
|
||||
func StartWatcher(basePath string, sharedCache *lru.Cache[string, *data.Item], cfg *config.Config) (*watcher.Watcher, error) {
|
||||
log = logging.GetLogger()
|
||||
func StartWatcher(basePath string, sharedCache *lru.Cache[string, *CacheItem.Item], cfg *config.Config) (*watcher.Watcher, error) {
|
||||
w := watcher.New()
|
||||
var wg sync.WaitGroup
|
||||
crawlerChan := make(chan struct{}, cfg.DirectoryCrawlers) // limit to cfg.DirectoryCrawlers concurrent crawlers
|
||||
|
@ -65,9 +61,8 @@ func StartWatcher(basePath string, sharedCache *lru.Cache[string, *data.Item], c
|
|||
wg.Add(1)
|
||||
go func() {
|
||||
defer wg.Done()
|
||||
pool := NewWorkerPool()
|
||||
crawler := NewDirectoryCrawler(sharedCache, pool)
|
||||
err := crawler.Crawl(event.Path, true)
|
||||
dc := DirectoryCrawler.NewDirectoryCrawler(sharedCache)
|
||||
err := dc.Crawl(event.Path, true)
|
||||
if err != nil {
|
||||
log.Warnf("WATCHER - Crawl failed: %s", err)
|
||||
}
|
||||
|
@ -95,7 +90,7 @@ func StartWatcher(basePath string, sharedCache *lru.Cache[string, *data.Item], c
|
|||
|
||||
// Log the number of cache entries every 60 seconds
|
||||
ticker := time.NewTicker(60 * time.Second)
|
||||
go func(c *lru.Cache[string, *data.Item]) {
|
||||
go func(c *lru.Cache[string, *CacheItem.Item]) {
|
||||
for range ticker.C {
|
||||
keys := c.Keys()
|
||||
log.Debugf("%d items in the cache.", len(keys))
|
||||
|
|
|
@ -7,29 +7,41 @@ import (
|
|||
)
|
||||
|
||||
type Config struct {
|
||||
RootDir string
|
||||
HTTPPort string
|
||||
WatchMode string
|
||||
CrawlModeCrawlInterval int
|
||||
DirectoryCrawlers int
|
||||
WatchInterval int
|
||||
CacheSize int
|
||||
CacheTime int
|
||||
CachePrintNew bool
|
||||
CachePrintChanges bool
|
||||
InitialCrawl bool
|
||||
CacheRecacheCrawlerLimit int
|
||||
CrawlerParseMIME bool
|
||||
HttpAPIListCacheControl int
|
||||
HttpAPIDlCacheControl int
|
||||
HttpAllowDirMimeParse bool
|
||||
HttpAdminKey string
|
||||
HttpAllowDuringInitialCrawl bool
|
||||
RestrictedDownloadPaths []string
|
||||
ApiSearchMaxResults int
|
||||
ApiSearchShowChildren bool
|
||||
CrawlerChannelBufferSize int
|
||||
CrawlerMaxWorkers int
|
||||
RootDir string
|
||||
HTTPPort string
|
||||
WatchMode string
|
||||
CrawlModeCrawlInterval int
|
||||
DirectoryCrawlers int
|
||||
CrawlWorkers int
|
||||
WatchInterval int
|
||||
CacheSize int
|
||||
CacheTime int
|
||||
CachePrintNew bool
|
||||
CachePrintChanges bool
|
||||
InitialCrawl bool
|
||||
CacheRecacheCrawlerLimit int
|
||||
CrawlerParseMIME bool
|
||||
HttpAPIListCacheControl int
|
||||
HttpAPIDlCacheControl int
|
||||
HttpAllowDirMimeParse bool
|
||||
HttpAdminKey string
|
||||
HttpAllowDuringInitialCrawl bool
|
||||
RestrictedDownloadPaths []string
|
||||
ApiSearchMaxResults int
|
||||
ApiSearchShowChildren bool
|
||||
WorkersJobQueueSize int
|
||||
ElasticsearchEnable bool
|
||||
ElasticsearchEndpoint string
|
||||
ElasticsearchSyncEnable bool
|
||||
ElasticsearchSyncInterval int
|
||||
ElasticsearchFullSyncInterval int
|
||||
ElasticsearchAPIKey string
|
||||
ElasticsearchIndex string
|
||||
ElasticsearchSyncThreads int
|
||||
ElasticsearchExcludePatterns []string
|
||||
ElasticsearchAllowConcurrentSyncs bool
|
||||
ElasticsearchFullSyncOnStart bool
|
||||
ElasticsearchDefaultQueryField string
|
||||
}
|
||||
|
||||
func LoadConfig(configFile string) (*Config, error) {
|
||||
|
@ -39,6 +51,7 @@ func LoadConfig(configFile string) (*Config, error) {
|
|||
viper.SetDefault("watch_mode", "crawl")
|
||||
viper.SetDefault("crawl_mode_crawl_interval", 3600)
|
||||
viper.SetDefault("directory_crawlers", 4)
|
||||
viper.SetDefault("crawl_workers", 10)
|
||||
viper.SetDefault("cache_size", 100000000)
|
||||
viper.SetDefault("cache_time", 30)
|
||||
viper.SetDefault("cache_print_new", false)
|
||||
|
@ -53,8 +66,20 @@ func LoadConfig(configFile string) (*Config, error) {
|
|||
viper.SetDefault("api_search_max_results", 1000)
|
||||
viper.SetDefault("api_search_show_children", false)
|
||||
viper.SetDefault("http_allow_during_initial_crawl", false)
|
||||
viper.SetDefault("crawler_channel_buffer_size", 1000)
|
||||
viper.SetDefault("crawler_max_workers", 200)
|
||||
viper.SetDefault("crawler_worker_job_queue_size", 0)
|
||||
viper.SetDefault("elasticsearch_enable", false)
|
||||
viper.SetDefault("elasticsearch_endpoint", "http://localhost:9200")
|
||||
viper.SetDefault("elasticsearch_sync_enable", true)
|
||||
viper.SetDefault("elasticsearch_sync_interval", 1800)
|
||||
viper.SetDefault("elasticsearch_full_sync_interval", 86400)
|
||||
viper.SetDefault("elasticsearch_api_key", "")
|
||||
viper.SetDefault("elasticsearch_index", "crazyfs_search")
|
||||
viper.SetDefault("elasticsearch_sync_threads", 50)
|
||||
viper.SetDefault("elasticsearch_exclude_patterns", []string{".git"})
|
||||
viper.SetDefault("elasticsearch_allow_concurrent_syncs", false)
|
||||
viper.SetDefault("elasticsearch_full_sync_on_start", false)
|
||||
viper.SetDefault("elasticsearch_query_fields", []string{"extension", "name", "path", "type", "size", "isDir"})
|
||||
viper.SetDefault("elasticsearch_default_query_field", "name")
|
||||
|
||||
err := viper.ReadInConfig()
|
||||
if err != nil {
|
||||
|
@ -63,35 +88,60 @@ func LoadConfig(configFile string) (*Config, error) {
|
|||
|
||||
restrictedPaths := viper.GetStringSlice("restricted_download_paths")
|
||||
for i, path := range restrictedPaths {
|
||||
restrictedPaths[i] = strings.TrimSuffix(path, "/")
|
||||
if restrictedPaths[i] != "/" {
|
||||
restrictedPaths[i] = strings.TrimSuffix(path, "/")
|
||||
}
|
||||
}
|
||||
|
||||
rootDir := strings.TrimSuffix(viper.GetString("root_dir"), "/")
|
||||
if rootDir == "" {
|
||||
rootDir = "/"
|
||||
}
|
||||
|
||||
workersJobQueueSizeValue := viper.GetInt("crawler_worker_job_queue_size")
|
||||
var workersJobQueueSize int
|
||||
if workersJobQueueSizeValue == 0 {
|
||||
workersJobQueueSize = viper.GetInt("crawl_workers") * 100
|
||||
} else {
|
||||
workersJobQueueSize = workersJobQueueSizeValue
|
||||
}
|
||||
|
||||
config := &Config{
|
||||
RootDir: rootDir,
|
||||
HTTPPort: viper.GetString("http_port"),
|
||||
WatchMode: viper.GetString("watch_mode"),
|
||||
CrawlModeCrawlInterval: viper.GetInt("crawl_mode_crawl_interval"),
|
||||
WatchInterval: viper.GetInt("watch_interval"),
|
||||
DirectoryCrawlers: viper.GetInt("crawl_mode_crawl_interval"),
|
||||
CacheSize: viper.GetInt("cache_size"),
|
||||
CacheTime: viper.GetInt("cache_time"),
|
||||
CachePrintNew: viper.GetBool("cache_print_new"),
|
||||
CachePrintChanges: viper.GetBool("cache_print_changes"),
|
||||
InitialCrawl: viper.GetBool("initial_crawl"),
|
||||
CacheRecacheCrawlerLimit: viper.GetInt("cache_recache_crawler_limit"),
|
||||
CrawlerParseMIME: viper.GetBool("crawler_parse_mime"),
|
||||
HttpAPIListCacheControl: viper.GetInt("http_api_list_cache_control"),
|
||||
HttpAPIDlCacheControl: viper.GetInt("http_api_download_cache_control"),
|
||||
HttpAllowDirMimeParse: viper.GetBool("http_allow_dir_mime_parse"),
|
||||
HttpAdminKey: viper.GetString("api_admin_key"),
|
||||
HttpAllowDuringInitialCrawl: viper.GetBool("http_allow_during_initial_crawl"),
|
||||
RestrictedDownloadPaths: restrictedPaths,
|
||||
ApiSearchMaxResults: viper.GetInt("api_search_max_results"),
|
||||
ApiSearchShowChildren: viper.GetBool("api_search_show_children"),
|
||||
CrawlerChannelBufferSize: viper.GetInt("crawler_channel_buffer_size"),
|
||||
CrawlerMaxWorkers: viper.GetInt("crawler_worker_pool_size"),
|
||||
RootDir: rootDir,
|
||||
HTTPPort: viper.GetString("http_port"),
|
||||
WatchMode: viper.GetString("watch_mode"),
|
||||
CrawlModeCrawlInterval: viper.GetInt("crawl_mode_crawl_interval"),
|
||||
WatchInterval: viper.GetInt("watch_interval"),
|
||||
DirectoryCrawlers: viper.GetInt("directory_crawlers"),
|
||||
CrawlWorkers: viper.GetInt("crawl_workers"),
|
||||
CacheSize: viper.GetInt("cache_size"),
|
||||
CacheTime: viper.GetInt("cache_time"),
|
||||
CachePrintNew: viper.GetBool("cache_print_new"),
|
||||
CachePrintChanges: viper.GetBool("cache_print_changes"),
|
||||
InitialCrawl: viper.GetBool("initial_crawl"),
|
||||
CacheRecacheCrawlerLimit: viper.GetInt("cache_recache_crawler_limit"),
|
||||
CrawlerParseMIME: viper.GetBool("crawler_parse_mime"),
|
||||
HttpAPIListCacheControl: viper.GetInt("http_api_list_cache_control"),
|
||||
HttpAPIDlCacheControl: viper.GetInt("http_api_download_cache_control"),
|
||||
HttpAllowDirMimeParse: viper.GetBool("http_allow_dir_mime_parse"),
|
||||
HttpAdminKey: viper.GetString("api_admin_key"),
|
||||
HttpAllowDuringInitialCrawl: viper.GetBool("http_allow_during_initial_crawl"),
|
||||
RestrictedDownloadPaths: restrictedPaths,
|
||||
ApiSearchMaxResults: viper.GetInt("api_search_max_results"),
|
||||
ApiSearchShowChildren: viper.GetBool("api_search_show_children"),
|
||||
WorkersJobQueueSize: workersJobQueueSize,
|
||||
ElasticsearchEnable: viper.GetBool("elasticsearch_enable"),
|
||||
ElasticsearchEndpoint: viper.GetString("elasticsearch_endpoint"),
|
||||
ElasticsearchSyncEnable: viper.GetBool("elasticsearch_sync_enable"),
|
||||
ElasticsearchSyncInterval: viper.GetInt("elasticsearch_sync_interval"),
|
||||
ElasticsearchFullSyncInterval: viper.GetInt("elasticsearch_full_sync_interval"),
|
||||
ElasticsearchAPIKey: viper.GetString("elasticsearch_api_key"),
|
||||
ElasticsearchIndex: viper.GetString("elasticsearch_index"),
|
||||
ElasticsearchSyncThreads: viper.GetInt("elasticsearch_sync_threads"),
|
||||
ElasticsearchExcludePatterns: viper.GetStringSlice("elasticsearch_exclude_patterns"),
|
||||
ElasticsearchAllowConcurrentSyncs: viper.GetBool("elasticsearch_allow_concurrent_syncs"),
|
||||
ElasticsearchFullSyncOnStart: viper.GetBool("elasticsearch_full_sync_on_start"),
|
||||
ElasticsearchDefaultQueryField: viper.GetString("elasticsearch_default_query_field"),
|
||||
}
|
||||
|
||||
if config.WatchMode != "crawl" && config.WatchMode != "watch" {
|
||||
|
@ -106,8 +156,12 @@ func LoadConfig(configFile string) (*Config, error) {
|
|||
return nil, errors.New("crawl_mode_crawl_interval must be more than 1")
|
||||
}
|
||||
|
||||
if config.CrawlWorkers < 1 {
|
||||
return nil, errors.New("crawl_workers must be more than 1")
|
||||
}
|
||||
|
||||
if config.CacheSize < 1 {
|
||||
return nil, errors.New("cache_size must be more than 1")
|
||||
return nil, errors.New("crawl_workers must be more than 1")
|
||||
}
|
||||
|
||||
if config.CacheRecacheCrawlerLimit < 1 {
|
||||
|
@ -130,8 +184,8 @@ func LoadConfig(configFile string) (*Config, error) {
|
|||
return nil, errors.New("api_search_max_results must not be less than 1")
|
||||
}
|
||||
|
||||
if config.CrawlerChannelBufferSize < 1 {
|
||||
return nil, errors.New("crawler_channel_buffer_size must not be less than 1")
|
||||
if config.ElasticsearchFullSyncInterval < config.ElasticsearchSyncInterval {
|
||||
return nil, errors.New("elasticsearch_full_sync_interval must be greater than elasticsearch_sync_interval")
|
||||
}
|
||||
|
||||
return config, nil
|
||||
|
|
|
@ -0,0 +1,13 @@
|
|||
package config
|
||||
|
||||
// Config constants
|
||||
var FollowSymlinks bool
|
||||
var CachePrintNew bool
|
||||
var RootDir string
|
||||
var CrawlerParseMIME bool
|
||||
var MaxWorkers int
|
||||
var HttpAllowDuringInitialCrawl bool
|
||||
var RestrictedDownloadPaths []string
|
||||
var ElasticsearchEnable bool
|
||||
var ElasticsearchEndpoint string
|
||||
var ElasticsearchSyncInterval int
|
|
@ -1,17 +1,21 @@
|
|||
package main
|
||||
|
||||
import (
|
||||
"crazyfs/CacheItem"
|
||||
"crazyfs/api"
|
||||
"crazyfs/cache"
|
||||
"crazyfs/cache/DirectoryCrawler"
|
||||
"crazyfs/config"
|
||||
"crazyfs/data"
|
||||
"crazyfs/elastic"
|
||||
"crazyfs/logging"
|
||||
"errors"
|
||||
"flag"
|
||||
"fmt"
|
||||
"github.com/elastic/go-elasticsearch/v8"
|
||||
lru "github.com/hashicorp/golang-lru/v2"
|
||||
"github.com/sirupsen/logrus"
|
||||
"net/http"
|
||||
_ "net/http/pprof"
|
||||
"os"
|
||||
"path/filepath"
|
||||
"time"
|
||||
|
@ -20,19 +24,20 @@ import (
|
|||
var log *logrus.Logger
|
||||
var cfg *config.Config
|
||||
|
||||
var lruSize int
|
||||
|
||||
type cliConfig struct {
|
||||
configFile string
|
||||
initialCrawl bool
|
||||
debug bool
|
||||
help bool
|
||||
configFile string
|
||||
initialCrawl bool
|
||||
debug bool
|
||||
disableElasticSync bool
|
||||
help bool
|
||||
}
|
||||
|
||||
// TODO: optional serving of frontend
|
||||
// TODO: admin api to clear cache, get number of items in cache, get memory usage
|
||||
// TODO: health api endpoint that tells us if the server is still starting
|
||||
// TODO: set global http headers rather than randomly setting them in routes
|
||||
// TODO: admin api endpoint to start a full refresh of elasticsearch
|
||||
// TODO: admin api endpoint to get status and progress of the full refresh of elasticsearch
|
||||
|
||||
func main() {
|
||||
cliArgs := parseArgs()
|
||||
|
@ -73,27 +78,37 @@ func main() {
|
|||
log.Fatalf("Config file does not exist: %s", cliArgs.configFile)
|
||||
}
|
||||
|
||||
cache.FollowSymlinks = false
|
||||
|
||||
var err error
|
||||
cfg, err = config.LoadConfig(cliArgs.configFile)
|
||||
if err != nil {
|
||||
log.Fatalf("Failed to load config file: %s", err)
|
||||
}
|
||||
|
||||
// Set global constants
|
||||
cache.WorkerBufferSize = cfg.CrawlerChannelBufferSize
|
||||
cache.PrintNew = cfg.CachePrintNew
|
||||
cache.RootDir = cfg.RootDir
|
||||
cache.CrawlerParseMIME = cfg.CrawlerParseMIME
|
||||
//cache.MaxWorkers = cfg.CrawlWorkers
|
||||
cache.WorkerSemaphore = make(chan struct{}, cfg.CrawlerMaxWorkers)
|
||||
|
||||
sharedCache, err := lru.New[string, *data.Item](cfg.CacheSize)
|
||||
sharedCache, err := lru.New[string, *CacheItem.Item](cfg.CacheSize)
|
||||
if err != nil {
|
||||
log.Fatal(err)
|
||||
}
|
||||
|
||||
// Set config variables
|
||||
// TODO: just pass the entire cfg object
|
||||
config.FollowSymlinks = false
|
||||
config.CachePrintNew = cfg.CachePrintNew
|
||||
config.RootDir = cfg.RootDir
|
||||
config.CrawlerParseMIME = cfg.CrawlerParseMIME
|
||||
config.MaxWorkers = cfg.CrawlWorkers
|
||||
config.HttpAllowDuringInitialCrawl = cfg.HttpAllowDuringInitialCrawl
|
||||
DirectoryCrawler.JobQueueSize = cfg.WorkersJobQueueSize
|
||||
config.RestrictedDownloadPaths = cfg.RestrictedDownloadPaths
|
||||
config.ElasticsearchEnable = cfg.ElasticsearchEnable
|
||||
config.ElasticsearchEndpoint = cfg.ElasticsearchEndpoint
|
||||
config.ElasticsearchSyncInterval = cfg.ElasticsearchSyncInterval
|
||||
|
||||
log.Infof("Elasticsearch enabled: %t", cfg.ElasticsearchEnable)
|
||||
|
||||
// Init global variables
|
||||
//DirectoryCrawler.CrawlWorkerPool = DirectoryCrawler.NewWorkerPool(config.MaxWorkers)
|
||||
DirectoryCrawler.WorkerPool = make(chan struct{}, config.MaxWorkers)
|
||||
|
||||
cache.InitRecacheSemaphore(cfg.CacheRecacheCrawlerLimit)
|
||||
|
||||
// Start the webserver before doing the long crawl
|
||||
|
@ -107,15 +122,13 @@ func main() {
|
|||
}()
|
||||
log.Infof("Server started on port %s", cfg.HTTPPort)
|
||||
|
||||
lruSize = cfg.CacheSize
|
||||
|
||||
if cliArgs.initialCrawl || cfg.InitialCrawl {
|
||||
log.Infoln("Performing initial crawl...")
|
||||
start := time.Now()
|
||||
cache.InitialCrawl(sharedCache, cfg)
|
||||
duration := time.Since(start).Round(time.Second)
|
||||
keys := sharedCache.Keys()
|
||||
log.Infof("Initial crawl completed in %s. %d directories and files added to the cache.", duration, len(keys))
|
||||
log.Infof("Initial crawl completed in %s. %d items added to the cache.", duration, len(keys))
|
||||
}
|
||||
|
||||
if cfg.WatchMode == "watch" {
|
||||
|
@ -128,13 +141,41 @@ func main() {
|
|||
defer watcher.Close()
|
||||
} else if cfg.WatchMode == "crawl" {
|
||||
//log.Debugln("Starting the crawler")
|
||||
err := cache.StartCrawler(cfg.RootDir, sharedCache, cfg)
|
||||
err := cache.StartCrawler(sharedCache, cfg)
|
||||
if err != nil {
|
||||
log.Fatalf("Failed to start timed crawler process: %s", err)
|
||||
}
|
||||
log.Infoln("Started the timed crawler process")
|
||||
}
|
||||
|
||||
if cfg.ElasticsearchEnable {
|
||||
// If we fail to establish a connection to Elastic, don't kill the entire server.
|
||||
// Instead, just disable Elastic.
|
||||
|
||||
esCfg := elasticsearch.Config{
|
||||
Addresses: []string{
|
||||
cfg.ElasticsearchEndpoint,
|
||||
},
|
||||
APIKey: cfg.ElasticsearchAPIKey,
|
||||
}
|
||||
es, err := elasticsearch.NewClient(esCfg)
|
||||
if err != nil {
|
||||
log.Errorf("Error creating the Elasticsearch client: %s", err)
|
||||
elastic.LogElasticQuit()
|
||||
cfg.ElasticsearchEnable = false
|
||||
} else {
|
||||
elastic.ElasticClient = es
|
||||
|
||||
if cfg.ElasticsearchSyncEnable && !cliArgs.disableElasticSync {
|
||||
go elastic.ElasticsearchThread(sharedCache, cfg)
|
||||
log.Info("Started the background Elasticsearch sync thread.")
|
||||
} else {
|
||||
log.Info("The background Elasticsearch sync thread is disabled.")
|
||||
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
select {}
|
||||
}
|
||||
|
||||
|
@ -145,6 +186,7 @@ func parseArgs() cliConfig {
|
|||
flag.BoolVar(&cliArgs.initialCrawl, "i", false, "Do an initial crawl to fill the cache")
|
||||
flag.BoolVar(&cliArgs.debug, "d", false, "Enable debug mode")
|
||||
flag.BoolVar(&cliArgs.debug, "debug", false, "Enable debug mode")
|
||||
flag.BoolVar(&cliArgs.disableElasticSync, "disable-elastic-sync", false, "Disable the Elasticsearch background sync thread")
|
||||
flag.Parse()
|
||||
return cliArgs
|
||||
}
|
||||
|
|
|
@ -1,16 +0,0 @@
|
|||
package data
|
||||
|
||||
type Item struct {
|
||||
Path string `json:"path"`
|
||||
Name string `json:"name"`
|
||||
Size int64 `json:"size"`
|
||||
Extension *string `json:"extension"`
|
||||
Modified string `json:"modified"`
|
||||
Mode uint32 `json:"mode"`
|
||||
IsDir bool `json:"isDir"`
|
||||
IsSymlink bool `json:"isSymlink"`
|
||||
Type *string `json:"type"`
|
||||
Children []*Item `json:"children"`
|
||||
Content string `json:"content,omitempty"`
|
||||
Cached int64 `json:"cached"`
|
||||
}
|
|
@ -0,0 +1,122 @@
|
|||
package elastic
|
||||
|
||||
import (
|
||||
"crazyfs/CacheItem"
|
||||
"crazyfs/config"
|
||||
lru "github.com/hashicorp/golang-lru/v2"
|
||||
"slices"
|
||||
"sync"
|
||||
"time"
|
||||
)
|
||||
|
||||
func ElasticsearchThread(sharedCache *lru.Cache[string, *CacheItem.Item], cfg *config.Config) {
|
||||
createCrazyfsIndex(cfg)
|
||||
|
||||
// Test connection to Elastic.
|
||||
esContents, err := getPathsFromIndex(cfg)
|
||||
if err != nil {
|
||||
logElasticConnError(err)
|
||||
return
|
||||
}
|
||||
esSize := len(esContents)
|
||||
log.Infof(`ELASTIC - index "%s" contains %d items.`, cfg.ElasticsearchIndex, esSize)
|
||||
|
||||
var wg sync.WaitGroup
|
||||
sem := make(chan bool, cfg.ElasticsearchSyncThreads)
|
||||
|
||||
// Run a partial sync at startup, unless configured to run a full one.
|
||||
syncElasticsearch(sharedCache, cfg, &wg, sem, cfg.ElasticsearchFullSyncOnStart)
|
||||
|
||||
ticker := time.NewTicker(time.Duration(cfg.ElasticsearchSyncInterval) * time.Second)
|
||||
fullSyncTicker := time.NewTicker(time.Duration(cfg.ElasticsearchFullSyncInterval) * time.Second)
|
||||
|
||||
var mutex sync.Mutex
|
||||
for {
|
||||
select {
|
||||
case <-ticker.C:
|
||||
if !cfg.ElasticsearchAllowConcurrentSyncs {
|
||||
mutex.Lock()
|
||||
}
|
||||
syncElasticsearch(sharedCache, cfg, &wg, sem, false)
|
||||
if !cfg.ElasticsearchAllowConcurrentSyncs {
|
||||
mutex.Unlock()
|
||||
}
|
||||
case <-fullSyncTicker.C:
|
||||
if !cfg.ElasticsearchAllowConcurrentSyncs {
|
||||
mutex.Lock()
|
||||
}
|
||||
syncElasticsearch(sharedCache, cfg, &wg, sem, true)
|
||||
if !cfg.ElasticsearchAllowConcurrentSyncs {
|
||||
mutex.Unlock()
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
func syncElasticsearch(sharedCache *lru.Cache[string, *CacheItem.Item], cfg *config.Config, wg *sync.WaitGroup, sem chan bool, fullSync bool) {
|
||||
var syncType string
|
||||
var esContents []string
|
||||
if fullSync {
|
||||
ElasticRefreshSyncRunning = true
|
||||
syncType = "full refresh"
|
||||
} else {
|
||||
ElasticNewSyncRunning = true
|
||||
syncType = "refresh"
|
||||
|
||||
var err error
|
||||
esContents, err = getPathsFromIndex(cfg)
|
||||
if err != nil {
|
||||
log.Errorf("ELASTIC - Failed to read the index: %s", err)
|
||||
return
|
||||
}
|
||||
}
|
||||
log.Infof("ELASTIC - starting a %s sync.", syncType)
|
||||
|
||||
start := time.Now()
|
||||
for _, key := range sharedCache.Keys() {
|
||||
wg.Add(1)
|
||||
go func(key string) {
|
||||
defer wg.Done()
|
||||
sem <- true
|
||||
cacheItem, found := sharedCache.Get(key)
|
||||
if !found {
|
||||
log.Fatalf(`ELASTICSEARCH - Could not fetch item "%s" from the LRU cache!`, key)
|
||||
} else {
|
||||
if !shouldExclude(key, cfg.ElasticsearchExcludePatterns) {
|
||||
if fullSync {
|
||||
addToElasticsearch(cacheItem, cfg)
|
||||
} else if !slices.Contains(esContents, key) {
|
||||
addToElasticsearch(cacheItem, cfg)
|
||||
}
|
||||
} else {
|
||||
deleteFromElasticsearch(key, cfg) // clean up
|
||||
//log.Debugf(`ELASTIC - skipping adding "%s"`, key)
|
||||
}
|
||||
}
|
||||
<-sem
|
||||
}(key)
|
||||
}
|
||||
wg.Wait()
|
||||
|
||||
log.Debugln("ELASTIC - Checking for removed items...")
|
||||
removeStaleItemsFromElasticsearch(sharedCache, cfg)
|
||||
|
||||
if fullSync {
|
||||
ElasticRefreshSyncRunning = false
|
||||
} else {
|
||||
ElasticNewSyncRunning = false
|
||||
}
|
||||
|
||||
duration := time.Since(start)
|
||||
log.Infof("ELASTIC - %s sync finished in %s", syncType, duration)
|
||||
}
|
||||
|
||||
func logElasticConnError(err error) {
|
||||
log.Errorf("ELASTIC - Failed to read the index: %s", err)
|
||||
LogElasticQuit()
|
||||
}
|
||||
|
||||
func LogElasticQuit() {
|
||||
log.Errorln("ELASTIC - background thread exiting, Elastic indexing and search will not be available.")
|
||||
|
||||
}
|
|
@ -0,0 +1,52 @@
|
|||
package elastic
|
||||
|
||||
import (
|
||||
"bytes"
|
||||
"context"
|
||||
"crazyfs/CacheItem"
|
||||
"crazyfs/config"
|
||||
"encoding/json"
|
||||
"github.com/elastic/go-elasticsearch/v8/esapi"
|
||||
)
|
||||
|
||||
func addToElasticsearch(item *CacheItem.Item, cfg *config.Config) {
|
||||
log.Debugf(`ELASTIC - Adding: "%s"`, item.Path)
|
||||
prepareCacheItem(item)
|
||||
data, err := json.Marshal(item)
|
||||
if err != nil {
|
||||
log.Printf("Error marshaling item: %s", err)
|
||||
return
|
||||
}
|
||||
req := esapi.IndexRequest{
|
||||
Index: cfg.ElasticsearchIndex,
|
||||
DocumentID: encodeToBase64(item.Path),
|
||||
Body: bytes.NewReader(data),
|
||||
Refresh: "true",
|
||||
}
|
||||
|
||||
res, err := req.Do(context.Background(), ElasticClient)
|
||||
if err != nil {
|
||||
log.Errorf("ELASTIC - Error getting response: %s", err)
|
||||
return
|
||||
}
|
||||
defer res.Body.Close()
|
||||
|
||||
if res.IsError() {
|
||||
var e map[string]interface{}
|
||||
if err := json.NewDecoder(res.Body).Decode(&e); err != nil {
|
||||
log.Printf("Error parsing the response body: %s", err)
|
||||
}
|
||||
log.Errorf(`ELASTIC - Error indexing document "%s" - Status code: %d - %s`, item.Path, res.StatusCode, e)
|
||||
}
|
||||
}
|
||||
|
||||
// prepareCacheItem is used to get an item ready to insert into Elastic.
|
||||
func prepareCacheItem(item *CacheItem.Item) {
|
||||
// We don't care about the children and this field's length may cause issues.
|
||||
item.Children = nil
|
||||
|
||||
// Length of this one also may cause issues.
|
||||
item.Content = ""
|
||||
|
||||
// Don't need to return anything since `item` is a pointer.
|
||||
}
|
|
@ -0,0 +1,74 @@
|
|||
package elastic
|
||||
|
||||
import (
|
||||
"context"
|
||||
"crazyfs/CacheItem"
|
||||
"crazyfs/config"
|
||||
"encoding/json"
|
||||
"github.com/elastic/go-elasticsearch/v8/esapi"
|
||||
lru "github.com/hashicorp/golang-lru/v2"
|
||||
"sync"
|
||||
)
|
||||
|
||||
func removeStaleItemsFromElasticsearch(sharedCache *lru.Cache[string, *CacheItem.Item], cfg *config.Config) {
|
||||
// Retrieve all keys from Elasticsearch
|
||||
keys, err := getPathsFromIndex(cfg)
|
||||
if err != nil {
|
||||
log.Errorf("ELASTIC - Error retrieving keys from Elasticsearch: %s", err)
|
||||
return
|
||||
}
|
||||
|
||||
// Create a buffered channel as a semaphore
|
||||
sem := make(chan struct{}, cfg.ElasticsearchSyncThreads)
|
||||
|
||||
// Create a wait group to wait for all goroutines to finish
|
||||
var wg sync.WaitGroup
|
||||
|
||||
// For each key in Elasticsearch, check if it exists in the LRU cache
|
||||
for _, key := range keys {
|
||||
// Increment the wait group counter
|
||||
wg.Add(1)
|
||||
|
||||
// Acquire a semaphore
|
||||
sem <- struct{}{}
|
||||
|
||||
go func(key string) {
|
||||
// Ensure the semaphore is released and the wait group counter is decremented when the goroutine finishes
|
||||
defer func() {
|
||||
<-sem
|
||||
wg.Done()
|
||||
}()
|
||||
|
||||
if _, ok := sharedCache.Get(key); !ok {
|
||||
// If a key does not exist in the LRU cache, delete it from Elasticsearch
|
||||
deleteFromElasticsearch(key, cfg)
|
||||
log.Debugf(`ELASTIC - Removed key "%s"`, key)
|
||||
}
|
||||
}(key)
|
||||
}
|
||||
|
||||
// Wait for all goroutines to finish
|
||||
wg.Wait()
|
||||
}
|
||||
|
||||
func deleteFromElasticsearch(key string, cfg *config.Config) {
|
||||
req := esapi.DeleteRequest{
|
||||
Index: cfg.ElasticsearchIndex,
|
||||
DocumentID: encodeToBase64(key),
|
||||
}
|
||||
|
||||
res, err := req.Do(context.Background(), ElasticClient)
|
||||
if err != nil {
|
||||
return
|
||||
}
|
||||
defer res.Body.Close()
|
||||
|
||||
// If we tried to delete a key that doesn't exist in Elastic, it will return an error that we will ignore.
|
||||
if res.IsError() && res.StatusCode != 404 {
|
||||
var e map[string]interface{}
|
||||
if err := json.NewDecoder(res.Body).Decode(&e); err != nil {
|
||||
log.Printf("Error parsing the response body: %s", err)
|
||||
}
|
||||
log.Errorf(`ELASTIC - Error deleting document "%s" - Status code: %d - %s`, key, res.StatusCode, e)
|
||||
}
|
||||
}
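The buffered-channel semaphore that `removeStaleItemsFromElasticsearch` uses to cap concurrent deletes can be looked at on its own. The sketch below is only an illustration of that pattern: `forEachBounded` and `countProcessed` are hypothetical names that do not appear in the commit, and the callback stands in for `deleteFromElasticsearch`.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// forEachBounded is a hypothetical helper (not in the commit). It calls fn
// once per key with at most limit calls in flight. limit must be at least 1,
// otherwise the first send on the unbuffered channel would block forever.
func forEachBounded(keys []string, limit int, fn func(key string)) {
	sem := make(chan struct{}, limit)
	var wg sync.WaitGroup

	for _, key := range keys {
		wg.Add(1)
		sem <- struct{}{} // blocks once limit goroutines are already running

		go func(key string) {
			// Release the slot and mark the job done when fn returns.
			defer func() {
				<-sem
				wg.Done()
			}()
			fn(key)
		}(key)
	}

	wg.Wait()
}

// countProcessed runs forEachBounded with a counting callback and reports
// how many keys were visited.
func countProcessed(keys []string, limit int) int {
	var n atomic.Int64
	forEachBounded(keys, limit, func(string) { n.Add(1) })
	return int(n.Load())
}

func main() {
	fmt.Println(countProcessed([]string{"/a", "/b", "/c"}, 2))
}
```

Acquiring the slot before the `go` statement, as the delete path does, also bounds the number of goroutines that exist at once. `syncElasticsearch` acquires inside the goroutine instead, which bounds the work but not the goroutine count.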
|
|
@ -0,0 +1,19 @@
|
|||
package elastic
|
||||
|
||||
import (
|
||||
"crazyfs/logging"
|
||||
"github.com/elastic/go-elasticsearch/v8"
|
||||
"github.com/sirupsen/logrus"
|
||||
)
|
||||
|
||||
var log *logrus.Logger
|
||||
var ElasticClient *elasticsearch.Client
|
||||
|
||||
var ElasticNewSyncRunning bool
|
||||
var ElasticRefreshSyncRunning bool
|
||||
|
||||
func init() {
|
||||
log = logging.GetLogger()
|
||||
ElasticNewSyncRunning = false
|
||||
ElasticRefreshSyncRunning = false
|
||||
}
|
|
@ -0,0 +1,27 @@
|
|||
package elastic
|
||||
|
||||
import (
|
||||
"encoding/base64"
|
||||
"strings"
|
||||
)
|
||||
|
||||
func shouldExclude(path string, exclusions []string) bool {
|
||||
parts := strings.Split(path, "/")
|
||||
|
||||
// Check each part of the path to see if it's in the exclusions list.
|
||||
// This will exclude all children as well.
|
||||
for _, part := range parts {
|
||||
for _, exclusion := range exclusions {
|
||||
if part == exclusion {
|
||||
return true
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
return false
|
||||
}
|
||||
|
||||
func encodeToBase64(s string) string {
|
||||
// Used to encode key names to base64 since file paths aren't very Elastic-friendly.
|
||||
return base64.RawURLEncoding.EncodeToString([]byte(s))
|
||||
}
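The two helpers above are easy to try in isolation. The sketch below copies them as they appear in the diff and adds `decodeFromBase64`, a hypothetical inverse that is not part of the commit, to show that the document ID can be mapped back to the original path. The `.git` exclusion is only an example value for `elasticsearch_exclude_patterns`.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"strings"
)

// shouldExclude mirrors the helper in the diff: a path is excluded when any
// of its "/"-separated components equals an exclusion exactly.
func shouldExclude(path string, exclusions []string) bool {
	for _, part := range strings.Split(path, "/") {
		for _, exclusion := range exclusions {
			if part == exclusion {
				return true
			}
		}
	}
	return false
}

// encodeToBase64 mirrors the helper in the diff: unpadded, URL-safe base64.
func encodeToBase64(s string) string {
	return base64.RawURLEncoding.EncodeToString([]byte(s))
}

// decodeFromBase64 is a hypothetical inverse, not part of the commit.
func decodeFromBase64(id string) (string, error) {
	b, err := base64.RawURLEncoding.DecodeString(id)
	return string(b), err
}

func main() {
	exclusions := []string{".git"} // example value only

	fmt.Println(shouldExclude("/projects/.git/config", exclusions))   // true
	fmt.Println(shouldExclude("/projects/git-notes.txt", exclusions)) // false

	id := encodeToBase64("/projects/notes.txt")
	path, err := decodeFromBase64(id)
	if err != nil {
		panic(err)
	}
	fmt.Println(id, path)
}
```

Because matching is done per path component, an exclusion such as `.git` removes the directory and everything below it, but does not match a file whose name merely contains that string.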
|
|
@ -0,0 +1,31 @@
|
|||
package elastic
|
||||
|
||||
import (
|
||||
"crazyfs/config"
|
||||
)
|
||||
|
||||
func createCrazyfsIndex(cfg *config.Config) {
|
||||
// Check if index exists
|
||||
res, err := ElasticClient.Indices.Exists([]string{cfg.ElasticsearchIndex})
|
||||
if err != nil {
|
||||
log.Fatalf("Error checking if index exists: %s", err)
|
||||
}
|
||||
defer res.Body.Close()
|
||||
|
||||
// If index does not exist, create it
|
||||
if res.StatusCode == 401 {
|
||||
log.Fatalln("ELASTIC - Failed to create a new index: got code 401.")
|
||||
} else if res.StatusCode == 404 {
|
||||
res, err = ElasticClient.Indices.Create(cfg.ElasticsearchIndex)
|
||||
if err != nil {
|
||||
log.Fatalf("Error creating index: %s", err)
|
||||
}
|
||||
defer res.Body.Close()
|
||||
|
||||
if res.IsError() {
|
||||
log.Printf("Error creating index: %s", res.String())
|
||||
}
|
||||
|
||||
log.Infof(`Created a new index named "%s"`, cfg.ElasticsearchIndex)
|
||||
}
|
||||
}
|
|
@ -0,0 +1,83 @@
|
|||
package elastic
|
||||
|
||||
import (
|
||||
"context"
|
||||
"crazyfs/config"
|
||||
"encoding/json"
|
||||
"errors"
|
||||
"fmt"
|
||||
"github.com/elastic/go-elasticsearch/v8/esapi"
|
||||
"time"
|
||||
)
|
||||
|
||||
func getPathsFromIndex(cfg *config.Config) ([]string, error) {
|
||||
// This may take a bit if the index is very large, so avoid calling this.
|
||||
|
||||
// Print a debug message so the user doesn't think we're frozen.
|
||||
log.Debugln("Fetching indexed paths from Elasticsearch...")
|
||||
|
||||
var paths []string
|
||||
var r map[string]interface{}
|
||||
|
||||
res, err := ElasticClient.Search(
|
||||
ElasticClient.Search.WithContext(context.Background()),
|
||||
ElasticClient.Search.WithIndex(cfg.ElasticsearchIndex),
|
||||
ElasticClient.Search.WithScroll(time.Minute),
|
||||
ElasticClient.Search.WithSize(1000),
|
||||
)
|
||||
if err != nil {
|
||||
msg := fmt.Sprintf("Error getting response: %s", err)
|
||||
return nil, errors.New(msg)
|
||||
}
|
||||
defer res.Body.Close()
|
||||
|
||||
if err := json.NewDecoder(res.Body).Decode(&r); err != nil {
|
||||
msg := fmt.Sprintf("Error parsing the response body: %s", err)
|
||||
return nil, errors.New(msg)
|
||||
}
|
||||
|
||||
for {
|
||||
scrollID := r["_scroll_id"].(string)
|
||||
hits := r["hits"].(map[string]interface{})["hits"].([]interface{})
|
||||
|
||||
// Break after no more documents
|
||||
if len(hits) == 0 {
|
||||
break
|
||||
}
|
||||
|
||||
// Iterate the document "hits" returned by API call
|
||||
for _, hit := range hits {
|
||||
doc := hit.(map[string]interface{})["_source"].(map[string]interface{})
|
||||
path, ok := doc["path"].(string)
|
||||
if ok {
|
||||
paths = append(paths, path)
|
||||
}
|
||||
}
|
||||
|
||||
// Next scroll
|
||||
res, err = ElasticClient.Scroll(ElasticClient.Scroll.WithScrollID(scrollID), ElasticClient.Scroll.WithScroll(time.Minute))
|
||||
if err != nil {
|
||||
msg := fmt.Sprintf("Error getting response: %s", err)
|
||||
return nil, errors.New(msg)
|
||||
}
|
||||
defer res.Body.Close()
|
||||
|
||||
if err := json.NewDecoder(res.Body).Decode(&r); err != nil {
|
||||
msg := fmt.Sprintf("Error getting response: %s", err)
|
||||
return nil, errors.New(msg)
|
||||
}
|
||||
}
|
||||
|
||||
// Clear the scroll
|
||||
clearScrollRequest := esapi.ClearScrollRequest{
|
||||
ScrollID: []string{r["_scroll_id"].(string)},
|
||||
}
|
||||
clearScrollResponse, err := clearScrollRequest.Do(context.Background(), ElasticClient)
|
||||
if err != nil {
|
||||
msg := fmt.Sprintf("Error clearing scroll: %s", err)
|
||||
return nil, errors.New(msg)
|
||||
}
|
||||
defer clearScrollResponse.Body.Close()
|
||||
|
||||
return paths, nil
|
||||
}
|
|
@ -0,0 +1,70 @@
|
|||
package elastic
|
||||
|
||||
import (
|
||||
"context"
|
||||
"crazyfs/config"
|
||||
"errors"
|
||||
"fmt"
|
||||
"github.com/elastic/go-elasticsearch/v8/esapi"
|
||||
"github.com/mitchellh/mapstructure"
|
||||
"strings"
|
||||
)
|
||||
|
||||
func Search(query string, exclude []string, cfg *config.Config) (*esapi.Response, error) {
|
||||
log.Debugf(`ELASTIC - Query: "%s"`, query)
|
||||
|
||||
var excludeQuery string
|
||||
if len(exclude) > 0 {
|
||||
var excludeConditions []string
|
||||
for _, e := range exclude {
|
||||
excludeConditions = append(excludeConditions, fmt.Sprintf(`{"query_string": {"query": "%s"}}`, e))
|
||||
}
|
||||
excludeQuery = fmt.Sprintf(`, "must_not": [%s]`, strings.Join(excludeConditions, ","))
|
||||
}
|
||||
|
||||
esQuery := fmt.Sprintf(`{
|
||||
"query": {
|
||||
"bool": {
|
||||
"must": {
|
||||
"simple_query_string": {
|
||||
"query": "%s",
|
||||
"default_operator": "and"
|
||||
}
|
||||
}%s
|
||||
}
|
||||
}
|
||||
}`, query, excludeQuery)
|
||||
|
||||
return ElasticClient.Search(
|
||||
ElasticClient.Search.WithContext(context.Background()),
|
||||
ElasticClient.Search.WithIndex(cfg.ElasticsearchIndex),
|
||||
ElasticClient.Search.WithBody(strings.NewReader(esQuery)),
|
||||
ElasticClient.Search.WithTrackTotalHits(true),
|
||||
ElasticClient.Search.WithPretty(),
|
||||
ElasticClient.Search.WithSize(cfg.ApiSearchMaxResults),
|
||||
)
|
||||
}
|
||||
|
||||
type ErrorReason struct {
|
||||
Reason string `mapstructure:"reason"`
|
||||
}
|
||||
|
||||
type ErrorRootCause struct {
|
||||
Causes []ErrorReason `mapstructure:"root_cause"`
|
||||
}
|
||||
|
||||
type SearchError struct {
|
||||
Error ErrorRootCause `mapstructure:"error"`
|
||||
}
|
||||
|
||||
func GetSearchFailureReason(respData map[string]interface{}) (string, error) {
|
||||
var data SearchError
|
||||
err := mapstructure.Decode(respData, &data)
|
||||
if err != nil {
|
||||
return "", err
|
||||
}
|
||||
if len(data.Error.Causes) > 0 {
|
||||
return data.Error.Causes[0].Reason, nil
|
||||
}
|
||||
return "", errors.New("no root cause found")
|
||||
}
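`Search` above builds its request body with `fmt.Sprintf`, so a query or exclusion containing a double quote or backslash would produce invalid JSON. One possible alternative, shown here only as a sketch, is to build the same `bool` / `simple_query_string` / `must_not` structure as Go maps and let `encoding/json` do the escaping. `buildSearchBody` is a hypothetical name and is not part of the commit.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// buildSearchBody is a hypothetical helper (not part of the commit). It
// produces the same query shape as Search, with user-supplied strings
// escaped by encoding/json instead of being interpolated into a template.
func buildSearchBody(query string, exclude []string) ([]byte, error) {
	boolQuery := map[string]any{
		"must": map[string]any{
			"simple_query_string": map[string]any{
				"query":            query,
				"default_operator": "and",
			},
		},
	}

	// Only add must_not when there is something to exclude.
	if len(exclude) > 0 {
		mustNot := make([]map[string]any, 0, len(exclude))
		for _, e := range exclude {
			mustNot = append(mustNot, map[string]any{
				"query_string": map[string]any{"query": e},
			})
		}
		boolQuery["must_not"] = mustNot
	}

	return json.Marshal(map[string]any{
		"query": map[string]any{"bool": boolQuery},
	})
}

func main() {
	body, err := buildSearchBody(`name:"my file"`, []string{"tmp"})
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
}
```

The resulting bytes could be passed to `ElasticClient.Search.WithBody` through `bytes.NewReader` in place of `strings.NewReader(esQuery)`.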
|
|
@ -0,0 +1,75 @@
|
|||
package file
|
||||
|
||||
import (
|
||||
"crazyfs/config"
|
||||
"crazyfs/logging"
|
||||
"github.com/gabriel-vasile/mimetype"
|
||||
"github.com/sirupsen/logrus"
|
||||
"mime"
|
||||
"os"
|
||||
"path/filepath"
|
||||
"strings"
|
||||
)
|
||||
|
||||
var log *logrus.Logger
|
||||
|
||||
func init() {
|
||||
log = logging.GetLogger()
|
||||
}
|
||||
|
||||
func GetMimeType(path string, analyze bool, passedInfo *os.FileInfo) (bool, string, string, error) {
|
||||
var MIME *mimetype.MIME
|
||||
var mimeType string
|
||||
var ext string
|
||||
var err error
|
||||
|
||||
var info os.FileInfo
|
||||
if config.FollowSymlinks {
|
||||
info, err = os.Lstat(path)
|
||||
} else {
|
||||
if passedInfo == nil {
|
||||
info, err = os.Stat(path)
|
||||
} else {
|
||||
info = *passedInfo
|
||||
}
|
||||
}
|
||||
|
||||
//if config.FollowSymlinks {
|
||||
// info, err = os.Stat(path)
|
||||
//} else {
|
||||
if err != nil {
|
||||
// File does not exist
|
||||
return false, "", "", err
|
||||
}
|
||||
if !info.IsDir() {
|
||||
if info.Mode()&os.ModeSymlink != 0 && !config.FollowSymlinks {
|
||||
return false, "", "", nil
|
||||
}
|
||||
ext = filepath.Ext(path)
|
||||
if analyze {
|
||||
MIME, err = mimetype.DetectFile(path)
|
||||
if err != nil {
|
||||
log.Warnf("Error analyzing MIME type: %v", err)
|
||||
return false, "", "", err
|
||||
}
|
||||
mimeType = MIME.String()
|
||||
} else {
|
||||
mimeType = mime.TypeByExtension(ext)
|
||||
}
|
||||
} else {
|
||||
return true, "", ext, nil
|
||||
}
|
||||
if strings.Contains(mimeType, ";") {
|
||||
mimeType = strings.Split(mimeType, ";")[0]
|
||||
}
|
||||
return true, mimeType, ext, nil
|
||||
}
|
||||
|
||||
func StripRootDir(path string) string {
|
||||
if path == "/" || path == config.RootDir || path == "" {
|
||||
// Avoid erasing our path
|
||||
return "/"
|
||||
} else {
|
||||
return strings.TrimSuffix(strings.TrimPrefix(path, config.RootDir), "/")
|
||||
}
|
||||
}
|
|
@ -2,7 +2,6 @@ package file
|
|||
|
||||
import (
|
||||
"bytes"
|
||||
"errors"
|
||||
"fmt"
|
||||
"github.com/chai2010/webp"
|
||||
"github.com/joway/libimagequant-go/pngquant"
|
||||
|
@ -11,71 +10,57 @@ import (
|
|||
"image/jpeg"
|
||||
"image/png"
|
||||
"io"
|
||||
"log"
|
||||
"os"
|
||||
)
|
||||
|
||||
func ConvertToPNG(filename string, contentType string) ([]byte, error) {
|
||||
imageBytes, err := os.Open(filename)
|
||||
imageFile, err := os.Open(filename)
|
||||
if err != nil {
|
||||
log.Fatal(err)
|
||||
return nil, fmt.Errorf("failed to open file: %w", err)
|
||||
}
|
||||
defer imageBytes.Close()
|
||||
defer imageFile.Close()
|
||||
|
||||
imageBytes, err := io.ReadAll(imageFile)
|
||||
if err != nil {
|
||||
return nil, fmt.Errorf("failed to read file: %w", err)
|
||||
}
|
||||
|
||||
var img image.Image
|
||||
switch contentType {
|
||||
case "image/png":
|
||||
imageBytes, err := io.ReadAll(imageBytes)
|
||||
if err != nil {
|
||||
return nil, errors.New("unable to read png")
|
||||
}
|
||||
return imageBytes, nil
|
||||
case "image/jpeg":
|
||||
img, err := jpeg.Decode(imageBytes)
|
||||
if err != nil {
|
||||
return nil, errors.New("unable to decode jpeg")
|
||||
}
|
||||
buf := new(bytes.Buffer)
|
||||
if err := png.Encode(buf, img); err != nil {
|
||||
return nil, errors.New("unable to encode png")
|
||||
}
|
||||
return buf.Bytes(), nil
|
||||
img, err = jpeg.Decode(bytes.NewReader(imageBytes))
|
||||
case "image/webp":
|
||||
img, err := webp.Decode(imageBytes)
|
||||
if err != nil {
|
||||
return nil, errors.New("unable to decode webp")
|
||||
}
|
||||
buf := new(bytes.Buffer)
|
||||
if err := png.Encode(buf, img); err != nil {
|
||||
return nil, errors.New("unable to encode png")
|
||||
}
|
||||
img, err = webp.Decode(bytes.NewReader(imageBytes))
|
||||
case "image/gif":
|
||||
img, err := gif.Decode(imageBytes)
|
||||
if err != nil {
|
||||
return nil, errors.New("unable to decode gif")
|
||||
}
|
||||
buf := new(bytes.Buffer)
|
||||
if err := png.Encode(buf, img); err != nil {
|
||||
return nil, errors.New("unable to encode png")
|
||||
}
|
||||
return buf.Bytes(), nil
|
||||
img, err = gif.Decode(bytes.NewReader(imageBytes))
|
||||
default:
|
||||
return nil, fmt.Errorf("unsupported content type: %s", contentType)
|
||||
}
|
||||
|
||||
return nil, errors.New(fmt.Sprintf("unable to convert %#v to png", contentType))
|
||||
if err != nil {
|
||||
return nil, fmt.Errorf("failed to decode image: %w", err)
|
||||
}
|
||||
|
||||
buf := new(bytes.Buffer)
|
||||
if err := png.Encode(buf, img); err != nil {
|
||||
return nil, fmt.Errorf("failed to encode image: %w", err)
|
||||
}
|
||||
|
||||
return buf.Bytes(), nil
|
||||
}
|
||||
|
||||
func CompressPNGFile(inputImg image.Image, quality int) (*bytes.Buffer, error) {
|
||||
// Compress the image using pngquant
|
||||
compressedImg, err := pngquant.Compress(inputImg, quality, pngquant.SPEED_FASTEST)
|
||||
if err != nil {
|
||||
return nil, err
|
||||
return nil, fmt.Errorf("failed to compress image: %w", err)
|
||||
}
|
||||
|
||||
// Create a bytes.Buffer and encode the compressed image into it
|
||||
buf := new(bytes.Buffer)
|
||||
err = (&png.Encoder{CompressionLevel: png.BestCompression}).Encode(buf, compressedImg)
|
||||
//err = png.Encode(buf, compressedImg)
|
||||
if err != nil {
|
||||
return nil, err
|
||||
return nil, fmt.Errorf("failed to encode image: %w", err)
|
||||
}
|
||||
|
||||
return buf, nil
|
||||
|
|
|
@ -0,0 +1,71 @@
|
|||
package file
|
||||
|
||||
import (
|
||||
"crazyfs/config"
|
||||
"fmt"
|
||||
"os"
|
||||
"path/filepath"
|
||||
"strings"
|
||||
)
|
||||
|
||||
// SafeJoin Clean the provided path
|
||||
func SafeJoin(pathArg string) (string, error) {
|
||||
cleanPath := filepath.Join(config.RootDir, filepath.Clean(pathArg))
|
||||
cleanPath = strings.TrimRight(cleanPath, "/")
|
||||
return cleanPath, nil
|
||||
}
|
||||
|
||||
func DetectTraversal(pathArg string) (bool, error) {
|
||||
// Remove the trailing slash so our checks always handle the same format
|
||||
if pathArg != "/" {
|
||||
pathArg = strings.TrimRight(pathArg, "/")
|
||||
}
|
||||
|
||||
// If the path starts with "~", a directory traversal attack is being attempted
|
||||
if strings.HasPrefix(pathArg, "~") {
|
||||
return true, fmt.Errorf("includes home directory: %s", pathArg)
|
||||
}
|
||||
|
||||
// The file path should ALWAYS be absolute.
|
||||
// For example: /Documents
|
||||
if !filepath.IsAbs(pathArg) {
|
||||
return true, fmt.Errorf("is not absolute path: %s", pathArg)
|
||||
}
|
||||
|
||||
cleanArg := filepath.Clean(pathArg)
|
||||
cleanPath := filepath.Join(config.RootDir, cleanArg)
|
||||
|
||||
// If the path is not within the base path, return an error
|
||||
if !strings.HasPrefix(cleanPath, config.RootDir) {
|
||||
return true, fmt.Errorf("the full path is outside the root dir: %s", pathArg)
|
||||
}
|
||||
|
||||
// If the cleaned path is not the same as the original path, a directory traversal attack is being attempted
|
||||
if pathArg != cleanArg {
|
||||
return true, fmt.Errorf("filepath.Clean modified the path arg from %s to %s", pathArg, cleanArg)
|
||||
}
|
||||
|
||||
return false, nil
|
||||
}
|
||||
|
||||
func PathExists(path string) (bool, error) {
|
||||
fileInfo, err := os.Lstat(path)
|
||||
if err != nil {
|
||||
if os.IsNotExist(err) {
|
||||
return false, nil // File or symlink does not exist
|
||||
}
|
||||
return false, err // Other error
|
||||
}
|
||||
|
||||
if fileInfo.Mode()&os.ModeSymlink != 0 {
|
||||
_, err := os.Stat(path)
|
||||
if err != nil {
|
||||
if os.IsNotExist(err) {
|
||||
return false, nil // Symlink is broken
|
||||
}
|
||||
return false, err // Other error
|
||||
}
|
||||
}
|
||||
|
||||
return true, nil // File or symlink exists and is not broken
|
||||
}
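The central check in `DetectTraversal` is that `filepath.Clean` must leave the argument unchanged. The sketch below reduces the function to that idea so the behaviour is easy to see. `looksLikeTraversal` is a hypothetical, simplified variant and not part of the commit: it skips the `~` and root-directory checks, and it assumes Unix-style paths.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// looksLikeTraversal is a simplified, hypothetical variant of
// DetectTraversal. It reports true when the argument is not absolute or
// when filepath.Clean would rewrite it (for example because of "..").
func looksLikeTraversal(pathArg string) bool {
	// Drop trailing slashes so "/Documents/" and "/Documents" compare equal.
	if pathArg != "/" {
		pathArg = strings.TrimRight(pathArg, "/")
	}
	if !filepath.IsAbs(pathArg) {
		return true
	}
	return filepath.Clean(pathArg) != pathArg
}

func main() {
	for _, p := range []string{
		"/Documents",
		"/Documents/",
		"/Documents/../../etc/passwd",
		"Documents",
	} {
		fmt.Printf("%-32q traversal=%t\n", p, looksLikeTraversal(p))
	}
}
```

Rejecting any path that `Clean` would modify is stricter than resolving the path and checking the result, but it means the handler never has to reason about what a rewritten path points to.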
|
|
@ -1,207 +0,0 @@
|
|||
package file
|
||||
|
||||
import (
|
||||
"archive/zip"
|
||||
"compress/flate"
|
||||
"crazyfs/api/helpers"
|
||||
"crazyfs/cache"
|
||||
"crazyfs/config"
|
||||
"crazyfs/data"
|
||||
"encoding/json"
|
||||
lru "github.com/hashicorp/golang-lru/v2"
|
||||
kzip "github.com/klauspost/compress/zip"
|
||||
"io"
|
||||
"net/http"
|
||||
"os"
|
||||
"path/filepath"
|
||||
)
|
||||
|
||||
func ZipHandler(dirPath string, w http.ResponseWriter, r *http.Request, compressionLevel int) {
|
||||
// compressionLevel follows compress/flate: 1 means best speed and 9 means best compression. flate also defines -1 (default compression) and -2 (Huffman only), but this handler stores files uncompressed for any value <= 0.
|
||||
// Set to 0 to disable compression (store mode)
|
||||
|
||||
// You need to write the headers and status code before any bytes
|
||||
w.Header().Set("Content-Type", "application/zip")
|
||||
// the filename which will be suggested in the save file dialog
|
||||
w.WriteHeader(http.StatusOK)
|
||||
|
||||
zipWriter := zip.NewWriter(w)
|
||||
|
||||
// Set the compression level
|
||||
if compressionLevel > 0 {
|
||||
zipWriter.RegisterCompressor(zip.Deflate, func(out io.Writer) (io.WriteCloser, error) {
|
||||
return flate.NewWriter(out, compressionLevel)
|
||||
})
|
||||
}
|
||||
|
||||
// Walk through the directory and add each file to the zip
|
||||
filepath.Walk(dirPath, func(filePath string, info os.FileInfo, err error) error {
|
||||
if info.IsDir() {
|
||||
return nil
|
||||
}
|
||||
|
||||
// Ensure the file path is relative to the directory being zipped
|
||||
relativePath, err := filepath.Rel(dirPath, filePath)
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
|
||||
header, err := zip.FileInfoHeader(info)
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
|
||||
header.Name = relativePath
|
||||
|
||||
if compressionLevel > 0 {
|
||||
header.Method = zip.Deflate
|
||||
} else {
|
||||
header.Method = zip.Store
|
||||
}
|
||||
|
||||
writer, err := zipWriter.CreateHeader(header)
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
|
||||
file, err := os.Open(filePath)
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
defer file.Close()
|
||||
|
||||
_, err = io.Copy(writer, file)
|
||||
return err
|
||||
})
|
||||
|
||||
err := zipWriter.Close()
|
||||
if err != nil {
|
||||
http.Error(w, err.Error(), http.StatusInternalServerError)
|
||||
}
|
||||
}
|
||||
|
||||
func ZipHandlerCompress(dirPath string, w http.ResponseWriter, r *http.Request) {
|
||||
w.Header().Set("Content-Type", "application/zip")
|
||||
w.WriteHeader(http.StatusOK)
|
||||
|
||||
zipWriter := kzip.NewWriter(w)
|
||||
// Walk through the directory and add each file to the zip
|
||||
filepath.Walk(dirPath, func(filePath string, info os.FileInfo, err error) error {
|
||||
if info.IsDir() {
|
||||
return nil
|
||||
}
|
||||
|
||||
// Ensure the file path is relative to the directory being zipped
|
||||
relativePath, err := filepath.Rel(dirPath, filePath)
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
|
||||
writer, err := zipWriter.Create(relativePath)
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
|
||||
file, err := os.Open(filePath)
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
defer file.Close()
|
||||
|
||||
_, err = io.Copy(writer, file)
|
||||
return err
|
||||
})
|
||||
|
||||
err := zipWriter.Close()
|
||||
if err != nil {
|
||||
http.Error(w, err.Error(), http.StatusInternalServerError)
|
||||
}
|
||||
}
|
||||
func ZipHandlerCompressMultiple(paths []string, w http.ResponseWriter, r *http.Request, cfg *config.Config, sharedCache *lru.Cache[string, *data.Item]) {
|
||||
zipWriter := kzip.NewWriter(w)
|
||||
// Walk through each file and add it to the zip
|
||||
for _, path := range paths {
|
||||
relPath := cache.StripRootDir(filepath.Join(cfg.RootDir, path), cfg.RootDir)
|
||||
fullPath := filepath.Join(cfg.RootDir, relPath)
|
||||
|
||||
// Check if the path is in the restricted download paths
|
||||
for _, restrictedPath := range cfg.RestrictedDownloadPaths {
|
||||
if relPath == restrictedPath {
|
||||
w.Header().Set("Content-Type", "application/json")
|
||||
w.WriteHeader(http.StatusForbidden)
|
||||
json.NewEncoder(w).Encode(map[string]interface{}{
|
||||
"code": http.StatusForbidden,
|
||||
"error": "not allowed to download this path",
|
||||
})
|
||||
return
|
||||
}
|
||||
}
|
||||
|
||||
// Try to get the data from the cache
|
||||
item, found := sharedCache.Get(relPath)
|
||||
if !found {
|
||||
item = helpers.HandleFileNotFound(relPath, fullPath, sharedCache, cfg, w)
|
||||
}
|
||||
if item == nil {
|
||||
// The errors have already been handled in helpers.HandleFileNotFound(), so we can just return
|
||||
return
|
||||
}
|
||||
|
||||
if !item.IsDir {
|
||||
writer, err := zipWriter.Create(relPath)
|
||||
if err != nil {
|
||||
http.Error(w, err.Error(), http.StatusInternalServerError)
|
||||
return
|
||||
}
|
||||
|
||||
file, err := os.Open(fullPath)
|
||||
if err != nil {
|
||||
http.Error(w, err.Error(), http.StatusInternalServerError)
|
||||
return
|
||||
}
|
||||
defer file.Close()
|
||||
|
||||
_, err = io.Copy(writer, file)
|
||||
if err != nil {
|
||||
http.Error(w, err.Error(), http.StatusInternalServerError)
|
||||
return
|
||||
}
|
||||
} else {
|
||||
w.Header().Set("Content-Disposition", `attachment; filename="files.zip"`)
|
||||
w.Header().Set("Content-Type", "application/zip")
|
||||
w.WriteHeader(http.StatusOK)
|
||||
|
||||
// If it's a directory, walk through it and add each file to the zip
|
||||
filepath.Walk(fullPath, func(filePath string, info os.FileInfo, err error) error {
|
||||
if info.IsDir() {
|
||||
return nil
|
||||
}
|
||||
|
||||
// Ensure the file path is relative to the directory being zipped
|
||||
relativePath, err := filepath.Rel(fullPath, filePath)
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
|
||||
writer, err := zipWriter.Create(filepath.Join(relPath, relativePath))
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
|
||||
file, err := os.Open(filePath)
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
defer file.Close()
|
||||
|
||||
_, err = io.Copy(writer, file)
|
||||
return err
|
||||
})
|
||||
}
|
||||
}
|
||||
|
||||
err := zipWriter.Close()
|
||||
if err != nil {
|
||||
http.Error(w, err.Error(), http.StatusInternalServerError)
|
||||
}
|
||||
}
|
|
@ -5,11 +5,13 @@ go 1.20
|
|||
require (
|
||||
github.com/chai2010/webp v1.1.1
|
||||
github.com/disintegration/imaging v1.6.2
|
||||
github.com/elastic/go-elasticsearch/v8 v8.11.1
|
||||
github.com/gabriel-vasile/mimetype v1.4.2
|
||||
github.com/gorilla/mux v1.8.0
|
||||
github.com/hashicorp/golang-lru/v2 v2.0.4
|
||||
github.com/joway/libimagequant-go v0.1.0
|
||||
github.com/klauspost/compress v1.16.7
|
||||
github.com/mitchellh/mapstructure v1.5.0
|
||||
github.com/nfnt/resize v0.0.0-20180221191011-83c6a9932646
|
||||
github.com/radovskyb/watcher v1.0.7
|
||||
github.com/sirupsen/logrus v1.9.3
|
||||
|
@ -17,10 +19,10 @@ require (
|
|||
)
|
||||
|
||||
require (
|
||||
github.com/elastic/elastic-transport-go/v8 v8.3.0 // indirect
|
||||
github.com/fsnotify/fsnotify v1.6.0 // indirect
|
||||
github.com/hashicorp/hcl v1.0.0 // indirect
|
||||
github.com/magiconair/properties v1.8.7 // indirect
|
||||
github.com/mitchellh/mapstructure v1.5.0 // indirect
|
||||
github.com/pelletier/go-toml/v2 v2.0.8 // indirect
|
||||
github.com/pkg/errors v0.9.1 // indirect
|
||||
github.com/spf13/afero v1.9.5 // indirect
|
||||
|
@ -30,7 +32,7 @@ require (
|
|||
github.com/subosito/gotenv v1.4.2 // indirect
|
||||
golang.org/x/image v0.0.0-20211028202545-6944b10bf410 // indirect
|
||||
golang.org/x/net v0.10.0 // indirect
|
||||
golang.org/x/sys v0.8.0 // indirect
|
||||
golang.org/x/sys v0.10.0 // indirect
|
||||
golang.org/x/text v0.9.0 // indirect
|
||||
gopkg.in/ini.v1 v1.67.0 // indirect
|
||||
gopkg.in/yaml.v3 v3.0.1 // indirect
|
||||
|
|
|
@ -53,6 +53,10 @@ github.com/davecgh/go-spew v1.1.1 h1:vj9j/u1bqnvCEfJOwUhtlOARqs3+rkHYY13jYWTU97c
|
|||
github.com/davecgh/go-spew v1.1.1/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38=
|
||||
github.com/disintegration/imaging v1.6.2 h1:w1LecBlG2Lnp8B3jk5zSuNqd7b4DXhcjwek1ei82L+c=
|
||||
github.com/disintegration/imaging v1.6.2/go.mod h1:44/5580QXChDfwIclfc/PCwrr44amcmDAg8hxG0Ewe4=
|
||||
github.com/elastic/elastic-transport-go/v8 v8.3.0 h1:DJGxovyQLXGr62e9nDMPSxRyWION0Bh6d9eCFBriiHo=
|
||||
github.com/elastic/elastic-transport-go/v8 v8.3.0/go.mod h1:87Tcz8IVNe6rVSLdBux1o/PEItLtyabHU3naC7IoqKI=
|
||||
github.com/elastic/go-elasticsearch/v8 v8.11.1 h1:1VgTgUTbpqQZ4uE+cPjkOvy/8aw1ZvKcU0ZUE5Cn1mc=
|
||||
github.com/elastic/go-elasticsearch/v8 v8.11.1/go.mod h1:GU1BJHO7WeamP7UhuElYwzzHtvf9SDmeVpSSy9+o6Qg=
|
||||
github.com/envoyproxy/go-control-plane v0.9.0/go.mod h1:YTl/9mNaCwkRvm6d1a2C3ymFceY/DCBVvsKhRF0iEA4=
|
||||
github.com/envoyproxy/go-control-plane v0.9.1-0.20191026205805-5f8ba28d4473/go.mod h1:YTl/9mNaCwkRvm6d1a2C3ymFceY/DCBVvsKhRF0iEA4=
|
||||
github.com/envoyproxy/go-control-plane v0.9.4/go.mod h1:6rpuAdCZL397s3pYoYcLgu1mIlRU8Am5FuJP05cCM98=
|
||||
|
@ -334,8 +338,8 @@ golang.org/x/sys v0.0.0-20210423185535-09eb48e85fd7/go.mod h1:h1NjWce9XRLGQEsW7w
|
|||
golang.org/x/sys v0.0.0-20210615035016-665e8c7367d1/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
|
||||
golang.org/x/sys v0.0.0-20220715151400-c0bba94af5f8/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
|
||||
golang.org/x/sys v0.0.0-20220908164124-27713097b956/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
|
||||
golang.org/x/sys v0.8.0 h1:EBmGv8NaZBZTWvrbjNoL6HVt+IVy3QDQpJs7VRIw3tU=
|
||||
golang.org/x/sys v0.8.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
|
||||
golang.org/x/sys v0.10.0 h1:SqMFp9UcQJZa+pmYuAKjd9xq1f0j5rLcDIk0mj4qAsA=
|
||||
golang.org/x/sys v0.10.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
|
||||
golang.org/x/term v0.0.0-20201126162022-7de9c90e9dd1/go.mod h1:bj7SfCRtBDWHUb9snDiAeCFNEtKQo2Wmx5Cou7ajbmo=
|
||||
golang.org/x/text v0.0.0-20170915032832-14c0d48ead0c/go.mod h1:NqM8EUOU14njkJ3fqMW+pc6Ldnwhi/IjpwHt7yyuwOQ=
|
||||
golang.org/x/text v0.3.0/go.mod h1:NqM8EUOU14njkJ3fqMW+pc6Ldnwhi/IjpwHt7yyuwOQ=
|
||||
|
|
|
@ -0,0 +1,5 @@
|
|||
- Add a wildcard option to restricted_download_paths to block all sub-directories
|
||||
- Add a dict to each restricted_download_paths item to specify how many levels recursive the block should be applied
|
||||
- Add an endpoint to return restricted_download_paths so the frontend can block downloads for those folders
|
||||
- Load the config into a global variable and stop passing it as function args
|
||||
- Remove the file change watcher mode
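
The first item in this list, a wildcard option for `restricted_download_paths`, could be sketched roughly as follows. The trailing `/*` syntax, the choice to block the directory itself as well as its contents, and the helper name `isRestricted` are all assumptions made for illustration; the commit does not define any of them.

```go
package main

import "strings"

// isRestricted reports whether relPath is blocked by any pattern.
// A plain pattern blocks only that exact path. A pattern ending in "/*"
// (an assumed syntax) blocks the directory and everything beneath it.
func isRestricted(relPath string, patterns []string) bool {
	for _, p := range patterns {
		if strings.HasSuffix(p, "/*") {
			base := strings.TrimSuffix(p, "/*")
			// Appending "/" keeps "/private/*" from matching "/privateer".
			if relPath == base || strings.HasPrefix(relPath, base+"/") {
				return true
			}
			continue
		}
		if relPath == p {
			return true
		}
	}
	return false
}
```

With this shape, existing exact-match entries keep their current meaning, and only entries that opt in with the wildcard suffix apply recursively.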
|