Is it possible, using KBM, to analyze text files and report the number of times each found common noun is used?
Of course, although I'm not sure whether your question means to exclude actions such as "Execute Shell Script." Here's how I would do it:
You place your file in the first box. The 'tr' command then replaces every space with a newline, so each word ends up on its own line, and the grep command returns the number of lines that contain the word Maestro.
There are some ways to fine-tune this. For example, it may not see "Maestro+Maestro" as two counts of the same word, because there is no space between these words. So my solution is simply a point for discussion and possible improvement. My solution does, however, account for case differences, like "maestro" and "Maestro."
If you put the word you want into a Keyboard Maestro variable, then I think you can simply replace Maestro above with $KMVAR_YourVariableName.
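The action itself is not reproduced in this thread, so here is a minimal sketch of the pipeline as described above, not the exact contents of the original action. The helper names count_word and count_occurrences and the example path are made up for illustration. The second helper is one possible refinement for the "Maestro+Maestro" case: it counts every match rather than every matching line.

```shell
# count_word FILE WORD
# Prints the number of lines containing WORD after splitting FILE on spaces.
# (count_word is a hypothetical helper name, not part of Keyboard Maestro.)
count_word() {
    # tr puts each space-separated word on its own line;
    # grep -c counts matching lines, -i ignores case.
    tr ' ' '\n' < "$1" | grep -ci -- "$2"
}

# count_occurrences FILE WORD
# Prints the number of matches, so "Maestro+Maestro" counts twice.
count_occurrences() {
    # grep -o prints each match on its own line; wc -l counts them.
    # The final tr strips the padding some wc implementations add.
    grep -oi -- "$2" "$1" | wc -l | tr -d ' '
}

# Inside an Execute Shell Script action, a Keyboard Maestro variable named
# YourVariableName is available to the script as $KMVAR_YourVariableName:
# count_word "/path/to/file.txt" "$KMVAR_YourVariableName"
```

Both helpers match substrings, so searching for "Maestro" also matches "Maestros". Adding -w to grep restricts matches to whole words.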
Thanx Airy. However, I am not looking for a particular word, just the most-used common nouns, whatever they may be, and I do not have a list of those words.
So I thought perhaps I could use KBM to control the flow and AppleScript to count and report the findings back to KBM, i.e. the three most-used common nouns in each file that gets analyzed.
I am trying to get the AppleScript to work, but it returns errors and won't compile. Here's what I have so far:
use framework "Foundation"
property directoryPath : "/Path/To/TextFiles"
tell application "Finder"
    set fileList to selection as list
end tell
repeat with eachFile in fileList
    set fileContents to (read text from file eachFile)
    set commonNouns to {}
    -- Extract common nouns using a regular expression
    set commonNounPattern to "(?<!\S)[A-Z][a-z]*(?=\s|$)"
    set commonNouns to (every match of commonNounPattern in fileContents)
    -- Count common noun occurrences
    set commonNounCounts to {}
    repeat with eachNoun in commonNouns
        set nounCount to (value of key eachNoun in commonNounCounts)
        if nounCount is equal to missing value then
            set nounCount to 1
        else
            set nounCount to nounCount + 1
        end if
        set value of key eachNoun in commonNounCounts to nounCount
    end repeat
    -- Sort by count
    set sortedCounts to (items of commonNounCounts) as list
    sort sortedCounts using {key: "value", ascending: false}
    -- Extract top three common nouns
    set topThreeNouns to (items 1 through 3 of sortedCounts)
    set topThreeNounsText to (text items of topThreeNouns as list, using ", ")
    -- Display results
    display topThreeNounsText as "Top 3 common nouns in " & name of eachFile
end repeat
I can't give advice about AppleScript, but I think I have a much simpler way to solve your problem. If you run this single KM action, you will see a list of the five most common words in your last message to me, with their frequencies. Shell tools are very mature and powerful. Maybe you'll be satisfied with this, or maybe you won't.
Count Words Macro (v11.0.1)
Count Words.kmmacros (2.6 KB)
If you like this solution, you may want to modify it to account for things like punctuation, capitalization, etc. This is just a very simple solution that may not take everything you want into account. It can be improved.
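The contents of the attached macro are not shown in this thread. As a rough sketch of what a shell-based word count of this kind could look like (an assumption, not the macro's actual script), the following already handles punctuation and capitalization. The function name top_words is made up for illustration.

```shell
# top_words FILE [N]
# Prints the N most frequent words in FILE with their counts (default N=5).
# (top_words is a hypothetical helper name.)
top_words() {
    tr -cs 'A-Za-z' '\n' < "$1" |   # turn each run of non-letters into one newline
        tr 'A-Z' 'a-z' |            # fold everything to lowercase
        grep -v '^$' |              # drop the empty line left by leading punctuation
        sort |                      # group identical words together
        uniq -c |                   # prefix each distinct word with its count
        sort -rn |                  # highest count first
        head -n "${2:-5}"           # keep the top N
}

# Example: top_words "/path/to/file.txt" 3
```

This counts every word, not just common nouns, so words like "the" and "and" will usually top the list. The shell tools used here cannot tell nouns from other words. One possible workaround is to filter the word list against a stop-word file before sorting, for example with grep -vwFf stopwords.txt, where stopwords.txt is a hypothetical file with one word per line.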
Thanx, Airy... I'll give that a try!
It's perhaps also worth looking at DEVONthink for that kind of thing.
Do a search here for Text Toolbox. It does that and more.