Human Search Engine
Human Search Engine is a person or persons searching for knowledge, information, or answers on the internet, using multiple search engines, resources, websites, keywords, and phrases to locate relevant information. A human search engine
may also use other media sources such as television, movies, documentaries,
radio, magazines, newspapers, advertisers, as well as recommendations
from other people. The main goal of a human
search engine is to find
relevant information and useful links to websites that pertain to a particular subject.
A human search engine is indexed by human
eyes and not by
algorithms.
Internet Mining
is the application of data mining techniques that are used to discover
patterns
in the World Wide Web.
Web mining can be divided into three different types: web usage mining, web content mining, and web structure mining.
Search Engine Types -
Search Engine Flaws -
Search
Technology
WebCrawler is a metasearch engine that blends the top search results from other search engines.
Web Robot is a software application that runs
automated tasks or scripts over the Internet.
Aggregate is to
form and
gather separate units into a
mass or whole.
Archivist
is an information professional who assesses, collects,
organizes, preserves, maintains control over, and provides access to
records and archives determined to have long-term value.
Filtering -
Defragging
-
Curation
-
Organizing Wiki Pages
Scribe
is a person who serves as a professional copyist, especially one who made
copies of manuscripts before the invention of automatic printing. The
profession of the scribe, previously widespread across cultures, lost most
of its prominence and status with the advent of the printing press. The
work of scribes can involve copying manuscripts and other texts as well as
secretarial and administrative duties such as the taking of dictation and
keeping of business, judicial, and historical records for kings, nobles,
temples, and cities. The profession has developed into public servants,
journalists, accountants, bookkeepers, typists, and lawyers. In societies
with low literacy rates, street-corner letter-writers (and readers) may
still be found providing scribe service.
World Brain
is a world
encyclopedia that could help
world citizens make the
best use of universal information resources and make the best contribution
to world peace.
Ontology is a
knowledge domain that
is usually
hierarchical and contains all the relevant entities
and their
relations.
Ontology is the
philosophical study of the nature of being, becoming, existence, or
reality, as well as the basic categories of being and their relations.
Ontology in
information science is a formal naming and definition of
the types, properties, and interrelationships of the entities that really
exist in a particular domain of discourse. Thus, it is basically a
taxonomy.
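As a rough illustration, here is a minimal sketch of an ontology in Python: entities arranged in an is-a hierarchy plus a named relation between them. The domain, entities, and relations are made-up examples.

# A minimal ontology sketch: a hierarchy of entities ("is-a" relations)
# plus named relations between entities. The domain is a made-up example.
ontology = {
    "entities": ["animal", "bird", "penguin", "fish"],
    "is_a": {"bird": "animal", "penguin": "bird", "fish": "animal"},
    "relations": [("penguin", "eats", "fish")],
}

def ancestors(entity, is_a):
    """Walk up the is-a hierarchy to list everything an entity 'is'."""
    chain = []
    while entity in is_a:
        entity = is_a[entity]
        chain.append(entity)
    return chain

print(ancestors("penguin", ontology["is_a"]))  # ['bird', 'animal']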
Vannevar Bush envisioned something like the internet before modern computers came into widespread use.
Mundaneum is a nonprofit organization based in Mons,
Belgium that runs an exhibition space, website and archive which celebrate
the legacy of the original Mundaneum established by Paul Otlet and Henri
La Fontaine in the early twentieth century.
Feltron.
I Am a Human Search Engine
I am a Human Search Engine, but I'm much more than that. I'm
on a quest to understand the meaning of
human intelligence. I'm also involved in the never-ending process of finding ways to
improve education, as well
as understanding how
the
public is informed about the
realities of our world
and
our current situation.
"The collector is the true resident of the interior. The
collector dreams his way not only into a distant or bygone world,
but also
into a better one" -
Walter Benjamin.
I'm an internet
miner
exploring the
world wide web. I'm
an
archivist
of information and
knowledge.
I'm extracting and aggregating the most valuable
information and
collecting the most informative websites
that the internet and the world have to offer.
I'm an
information architect
who is
filtering and
organizing the
internet one website at a time. I'm a knowledge moderator, an internet scribe,
and an
intelligent agent, like
AI. But
it's
more than that. I'm an accumulator of knowledge who seeks to pass knowledge on to
others. Lowering the
entropy of the system since 2008.
Web Portal is
a specially designed web site that brings information together
from diverse sources in a uniform way.
Extract, Transform, Load is a process in database usage, and especially in data warehousing, that extracts data from homogeneous or heterogeneous data sources, transforms the data into the proper format or structure for the purposes of querying and analysis, and loads it into the final target, such as a database, operational data store, data mart, or data warehouse.
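As a rough sketch of that flow in Python, assuming a hypothetical sales.csv source file and a local SQLite database as the final target:

import csv
import sqlite3

def extract(path):
    # Extract: read raw rows from the source file.
    with open(path, newline="") as f:
        return list(csv.DictReader(f))

def transform(rows):
    # Transform: clean and type the data into the proper structure.
    return [(r["name"].strip().title(), float(r["price"]))
            for r in rows if r["price"]]

def load(records, db_path="warehouse.db"):
    # Load: insert into the final target, here a SQLite table.
    con = sqlite3.connect(db_path)
    con.execute("CREATE TABLE IF NOT EXISTS sales (name TEXT, price REAL)")
    con.executemany("INSERT INTO sales VALUES (?, ?)", records)
    con.commit()
    con.close()

load(transform(extract("sales.csv")))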
Welcome to my Journey in
hyperlink heaven. Over
22 years of internet searches that
are organized, categorized and contextualized. A researcher's dream. I have already clicked my mouse over 10 million times, and I've only just begun. I have tracked over 90% of my online activities since 1998, so my digital trail is a long one. This is my story about one man's journey
through the Internet.
What if you shared everything you learned?
Did you ever wonder?
To put it simply
I'm Organizing the Internet.
Over the last 22
years since 1998, I have been surfing the world wide web, or
trail blazing the internet, and
curating
my experience. I've asked
the internet well over 500,000 Questions so far. And from those
questions I have gathered a lot of Information, Knowledge and
Resources. So I then organized this Information, Knowledge and
Resources into
categories. I then published it on my website so that
the Information, Knowledge and Resources can be shared and used
for educational purposes. I also share what I've personally
learned from this incredible endless journey that I have taken
through the internet. The internet is like the universe: I'm not overwhelmed by the size of the internet, I'm just amazed by all the things that I have learned, and I wonder just how much more I will be able to understand. Do knowledge and information have a limit? Well, let's find out.
Adventure for
me has always been about discovering limits; this is just
another
adventure. I'm an internet
surfer who has been riding the perfect wave for over 12 years.
But this is nothing new. In the early 1900's,
Paul Otlet pursued his quest to organize the world’s
information.
A human
search engine
is someone who is not manipulated by money or by defective and
ineffective algorithms.
A human search engine is
created by humans
and is a service for humans. People want what's
important. People want the most valuable knowledge and
information that is available, without stupid ads, and without
any ignorant manipulation or
censorship. People want a trusted source for information, a
source that cares about people more than money. We don't
have everything, but who needs everything?
I'm a
pilgrim on a
pilgrimage.
I'm an internet pathfinder whose task it is to carry out
daily internet reconnaissance missions and document my
findings. I'm not an
internet
guru or a
gatekeeper, but I have created an excellent
internet resource.
Our physical journeys in the world are just as important as our
mental explorations in the
mind, the
discoveries are endless.
These days I seem to be leaving more
digital footprints than actual footprints,
which seems more meaningful in this day and age.
Quest is
the act of searching for something, searching for
an
alternative that meets your needs.
Quest is a difficult
journey towards a
goal, often symbolic, abstract in
idea or
metaphor. An
adventure.
I'm more of a
knowledge
organizer and Knowledge Sharer than a
knowledge keeper. I
also wouldn't say that I'm
a wisdom keeper, I am more of a wisdom sharer, which makes everyone a
wisdom beneficiary.
I am just a bee in the hive of Knowledge, doing my part to keep the hive
productive.
Beehive is an enclosed structure in which some honey
bee
species of the subgenus Apis live and raise their young.
Knowledge Hive
-
Knowledge Hives
-
The Hive
Knowledge Platform (youtube)
Honeycomb is a mass of
hexagonal prismatic wax cells built by honey bees in their nests to
contain their larvae and stores of honey and pollen.
Polyhedron.
“For every minute spent organizing, an hour is earned.”
I feel like a
human conduit, a passage, a pipe, a tunnel or a channel for
transferring information and synchronizing information to and from various
destinations.
Two Directory
Projects are the work accumulated from one Human Editor -
The Power of One (youtube)
Looking for Adventure.com has over
60,000 handpicked Websites. (External Links) - LFA took
14 years to accumulate as of 2016.
Basic
Knowledge 101.com has over
50,000 handpicked Websites. (External Links) - Took 8 years
to accumulate as of 2016.
The Internet and Computer Digital
Information combined allow a person to save the work that they have done
and create a living record of information and experiences.
Example: Looking for Adventure.com, "not a total copy of my life but getting close". Things don't have to be
written in stone
anymore, but it doesn't hurt to have an
extra copy.
When I started
in 1998 I didn't know how much knowledge and information I would find,
nor did I know what kind of knowledge and information I would find, nor what kind of benefits would come from this knowledge and information. Like a miner in the old days, you
dig a little each day and see what you get. And wouldn't you
know it, I hit the jackpot. The wealth of information and
knowledge that there is in the world is enormous, and
invaluable. But we can't celebrate just yet, we still
need to distribute our wealth of knowledge and information and
give everyone access. Otherwise we will never fully benefit from our wealth of knowledge and information, nor will we ever fully benefit from the enormous potential that it will give us.
"I saw a huge unexplored
ocean, so naturally I dove in to take a look. 8 years later in
2016, I have been exploring this endless sea of knowledge, and
have come to realize that I have found a home."
About my Research.
What have I Learned about being a Human Search Engine
I am a
semantic web as well as a Human Search Engine. Humans will
always be better than machines when it comes to associations,
perceptions,
perspectives,
categorizing and
organizing; some things need to be done manually, especially
when it comes to organizing information and knowledge.
Linking
data,
ontology learning,
library and information science, creating a
Visual Thesaurus and
tag clouds are what I have been doing for 10 years. "Welcome to Web 3.0." I'm an
intelligent agent combining
logic and
fuzzy logic, because there are just some things that
machines or
Artificial Intelligence cannot
do or do well.
Automated
reasoning systems and
computational logic can only do so much. So we need more
intelligent humans than computer
algorithms.
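As a rough illustration of the difference, a minimal Python sketch of crisp (boolean) logic next to fuzzy logic; the relevance thresholds are invented for the example:

def crisp_relevant(score):
    # Crisp logic: a hard yes/no answer.
    return score >= 0.5

def fuzzy_relevant(score):
    # Fuzzy logic: a degree of membership between 0 and 1,
    # here a linear ramp from 0.2 up to 0.8.
    return min(1.0, max(0.0, (score - 0.2) / 0.6))

for s in (0.1, 0.4, 0.6, 0.9):
    print(s, crisp_relevant(s), round(fuzzy_relevant(s), 2))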
Creating knowledge bases is absolutely essential. This is why I believe
that having more Human Search Engines is a benefit to anyone seeking
knowledge and information. Structuring websites into syntax link patterns and information into categories or taxonomies while remaining objective and impartial.
Organizing information
and websites so that visitors have an easy time finding what they're
looking for, plus at the same time, showing them other things that are
related to that particular subject that might also be of interest to them. More relevant choices and a great alternative and
complement to
search engines. But it's not
easy to manage and maintain a human search engine, especially
for one person. You're constantly updating the link database,
adding links, replacing links or removing some links altogether.
Then on top of that there's the organizing and the adding of
content, photos and video. And all the while your website grows
and grows. Adding related subjects and subcategorizing
information and links. Cross linking or
cross-referencing so that
related information can be found in more than one place while at
the same time displaying more
connections and more
associations.
Interconnectedness
-
Human
Based Genetic Algorithms -
Principle
of Least Effort -
Abstraction -
Relational Model.
What being a Human Search Engine Represents
A Human Search Engine is more than just a website with hyperlinking, and it's more than just
an Information hub or
a
node with
contextual information and
structured grouping.
A Human Search Engine is also more than just
knowledge organization, which is a branch of
library and
information science concerned with activities such as document
description, indexing and
classification performed in libraries,
databases, archives, etc.
Intelligence Gathering is a method by which a country
gathers information using non-governmental employees.
Internet Aggregation refers to a web site or computer
software that
aggregates a specific type of information from
multiple online
sources.
Knowledge Extraction is the creation of
knowledge from
structured
relational databases
or XML, and unstructured text, documents,
images and sources. The resulting knowledge needs to be in a
machine-readable
and machine-interpretable format and must represent knowledge in a manner
that facilitates
inferencing.
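A minimal Python sketch of the idea: pull (subject, relation, object) triples out of free text with a naive pattern, then infer one new machine-readable fact from them. The sentences, the pattern, and the transitivity rule are simplifications for illustration.

import re

# Extract (subject, relation, object) triples from unstructured text.
text = "Mons is in Belgium. Belgium is in Europe."
triples = set(re.findall(r"(\w+) (is in) (\w+)", text))

# One step of inference: "is in" is transitive, so derive new facts.
for a, rel, b in list(triples):
    for c, rel2, d in list(triples):
        if rel == rel2 == "is in" and b == c:
            triples.add((a, "is in", d))

print(sorted(triples))  # includes the derived ('Mons', 'is in', 'Europe')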
Information Extraction -
Information Filtering System
Glean is to extract information from
various sources. Gather, as of natural products. Accumulate
resources.
Database Indexing -
File System -
Knowledge Base -
Knowledge Management
Media
Curation -
Digital
Curation -
Documentation
Master
Directory is a file system cataloging structure which
contains references to other computer files, and possibly other
directories. On many computers,
directories are known as folders, or
drawers to provide some relevancy to a workbench or the traditional office
file cabinet.
Web Directory is a directory on the World Wide Web. A
collection of data
organized
into categories. It specializes in linking to other web sites and
categorizing those links.
Web of Knowledge.
Website Library -
Types of Books
Web indexing refers to various methods for
indexing the
contents of a website or of the Internet as a whole. Individual websites
or intranets may use a back-of-the-book index, while search engines
usually use keywords and metadata to provide a more useful vocabulary for
Internet or onsite searching. With the increase in the number of
periodicals that have articles online, web indexing is also becoming
important for periodical websites.
Web Index.
Semantic Web is an extension of the Web through standards by the World
Wide Web Consortium (W3C). The standards promote common data formats and
exchange protocols on the Web, most fundamentally the Resource Description
Framework (RDF). The Semantic Web provides a common framework that
allows data to be shared and reused
across application, enterprise, and community boundaries. The Semantic Web
is therefore regarded as an integrator across different content,
information applications and systems.
The goal of the Semantic Web is to make Internet data machine-readable.
Semantic Web Info.
Machine-Readable Data is data in a format that can be
processed by a computer. Machine-readable data must be structured data
and in a format that can be easily processed by a computer without human
intervention while ensuring no semantic meaning is lost. Machine readable
is not synonymous with
digitally
accessible. A digitally accessible document may be online, making it
easier for humans to access via computers, but its content is much harder
to extract, transform, and process via computer programming logic if it is
not machine-readable.
Extensible
Markup Language (XML) is designed to be both human- and
machine-readable, and
Extensible
Style Sheet Language Transformation (XSLT) is used to improve
presentation of the data for human readability. For example, XSLT can be
used to automatically render XML in
Portable Document Format (PDF). Machine-readable data can be
automatically transformed for human-readability but, generally speaking,
the reverse is not true.
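A minimal Python sketch of that machine-readability, using only the standard library; the catalog document is a made-up example:

import xml.etree.ElementTree as ET

# The same record a human can read parses into structured data
# without guesswork, because the markup is machine-readable.
doc = """<catalog>
  <site><title>Mundaneum</title><url>http://example.org</url></site>
</catalog>"""

root = ET.fromstring(doc)
for site in root.iter("site"):
    print(site.findtext("title"), "->", site.findtext("url"))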
Academics - My Fundamental Contribution
In a way my work as a Human Search Engine is my
dissertation.
My
thesis is
Basic Knowledge 101 and proving
the importance of a
Human Operating System in
regards to having a more comprehensive and effective education. This is my
tenure. My Education Knowledge Database Project. This
is just the beginning of my
intellectual works.
Basic Knowledge 101.com is my
curriculum vitae. Working on this project I went from an
undergraduate
study, through
postgraduate
education right into a
graduate program.
I started out as a non-degree seeking student but I ended up
with a
master's degree and a
doctoral degree, well almost.
I have done my fieldwork, I have acquired specialized skills, I
have done advanced original
research. But I still have no name for my
Advanced Academic Degree. Maybe "
Internet Comprehension 101".
My Business Card -
HyperLand
(youtube)
Academic Tenure is defending the principle of academic freedom, which
holds that it is beneficial for society in the long run if scholars are
free to hold and examine a variety of views.
Tenure is to give someone a permanent post, especially as a teacher
or professor.
Internet Studies
is an
interdisciplinary field studying the
social, psychological, pedagogical, political, technical, cultural,
artistic, and other dimensions of the Internet and associated information
and communication technologies.
Internet and
society is a research field that addresses the interrelationship of
Internet and
society, i.e. how
society has changed the Internet and how the Internet has changed society.
Information Science
-
Peer-to-Peer -
Open Source -
Free Open Access.
How Can One Person Create Databases this Large in Such a
Short Time
The techniques and methods are quite simple when you're using
the
Internet. You literally have
thousands upon thousands of smart people indirectly doing a tremendous
amount of work for you. This gives individuals the power and the ability
to solve almost any problem. Sharing information and knowledge on a
platform that millions of people can have access to has transformed our
existence in so many ways that people cannot even comprehend the changes
that are happening now or have happened already and will most likely
happen in the future.
First Step: When doing internet searches, for whatever
reason, you are bound to come across a website or keyword phrase
that relates to your subject matter. Then you do more searches
using those keywords and then save those keywords and websites
to your database. This is very important because most likely you
will never come across the same info related to those particular
search parameters, so saving and documenting your findings is
very important.
Terminology Extraction.
Second Step: When reading, watching TV, watching a movie or even
talking with someone, you are bound to come across ideas and
keywords that you could use when searching for more information
pertaining to your subject. Then again saving and documenting
your findings is very important. It's always a good idea to have
a pen and paper handy to write things down or you can use your
cell phone to record a voice memo so that you don't forget your
information or ideas. The main thing is to have a subject that
you're interested in and at the same time being aware of what
information is valuable to your subject when it finally presents
itself. Combining a
human algorithm with a
randomized
algorithm.
Third Step: Organizing, updating and improving your database
so that it stays functional and easy to access. So my time is
usually balanced between these three tasks, and yes it is time
consuming. You can also use the
Big 6 Techniques when gathering
Information to help with your efficiency and effectiveness.
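A minimal Python sketch of the save-and-document habit from the steps above: every useful keyword and website gets appended to a simple categorized file. The file name, category, and URL are hypothetical.

import csv
import datetime

def save_finding(category, keywords, url, path="search_trails.csv"):
    # Append one documented finding: date, category, keywords, website.
    with open(path, "a", newline="") as f:
        csv.writer(f).writerow(
            [datetime.date.today().isoformat(), category, keywords, url])

save_finding("astronomy", "neutrino detector", "https://example.edu/neutrino")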
I also created an Internet Searching Tips help section for useful ideas.
Glossary.
One Last Thing: If you spend a lot of time on the internet
doing searches and looking for answers you are bound to come
across some really useful websites and information that were not
relevant to what you were originally searching for. So it's a
good idea to start saving these useful websites in new
categories or just save them in an appropriately named folder in
your documents. This way you can share these websites with
friends or just
use them at some later time.
It is sometimes called Creating Search Trails, of which I have 21 years' worth as of 2019. Not bad for a
Personal Web Page.
Visible Web - World Wide Web - Dark Web
I have always used the world wide
web or the
surface web for my work. And I have always had a good
connection,
but not a totally secure
connection. So just in case you need to search the web when the mainstream web becomes too risky, you should know about your alternatives.
World
Wide Web is an information space of
networked computers where
documents and other
web resources are identified by
Uniform
Resource Locators interlinked by
hypertext
links, and can be accessed via the
Internet.
URL is a website address.
Surface Web is that portion of the
World
Wide Web that is readily available to the general public and
searchable with standard web search engines. It is the opposite of the deep web.
The surface web is also called the Visible Web, Clearnet,
Indexed
Web, Indexable Web or Lightnet.
The Deep Web consists of those pages that Google and other
search engines don't index.
The Deep Web is about 500 times larger than the Visible Web, but
the Visible Web is much easier to access.
Deep Web (wiki).
The Dark Web is an actively hidden, often
anonymous part of the
deep web but it isn't inherently bad.
Dark Internet (wiki).
Deep Web Exploring the
Dark Internet, the part of the internet that very few people have ever seen.
Memex (wiki).
How the Mysterious Dark Net is going Mainstream (video) -
Tor Project.
Google has indexed 1 trillion pages so far in 2016, but that is
only 5% of the total knowledge and information that we have.
Filtering - Gatekeeping
Information Filtering System is a
system that removes
redundant or
unwanted information from an information stream, using semi
automated or computerized methods prior to
presentation to a
human user.
Its main goal is the
management of
any
information overload,
propaganda or
errors, and the
signal-to-noise ratio. To do this the user's
profile is
compared to some
reference characteristics. These
characteristics may originate from the information item using the content-based
approach, or from the user's social environment using the
collaborative filtering
approach. Filtering should never create a
filter bubble that
influences
biases or blind
conformity.
Filtering is not to be confused with
censorship. A filter is
not a
wall. Filtering is like
speed reading.
Retrieving the most essential information as efficiently as possible.
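A minimal Python sketch of the content-based approach described above, with made-up profile weights and threshold: the user's profile is a set of weighted keywords, each item in the stream is scored against it, and only items above the threshold get through.

# A user profile as weighted keywords (invented illustration values).
profile = {"education": 2.0, "knowledge": 1.5, "celebrity": -2.0}

def score(text):
    # Score an item by summing the profile weights of its words.
    words = text.lower().split()
    return sum(profile.get(w, 0.0) for w in words)

stream = ["new education knowledge portal", "celebrity gossip roundup"]
for item in stream:
    if score(item) > 1.0:       # a filter, not a wall: tune the threshold
        print("pass:", item)
    else:
        print("drop:", item)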
Ratings -
Free Speech Abuses -
Social Network
Monitoring
Filter is a device that
removes something from whatever
passes through it. A
porous device for
removing impurities or solid particles from a liquid or gas passed through
it.
Porous is something full of pores or
vessels or holes allowing passage in and out.
Membranes -
Polarizers
Collaborative Filtering is a technique used by recommender systems.
Collaborative filtering
has two senses, a narrow one and a more general one.
Relative -
Tuning Out Irrelevant
Information -
Error
Correcting
Data Cleansing is the process of detecting and correcting or
removing corrupt or inaccurate records from
a record set, table, or database and refers to identifying incomplete,
incorrect, inaccurate or
irrelevant parts of the data and then replacing, modifying, or
deleting the dirty or coarse data. Data cleansing may be performed
interactively with data wrangling tools, or as batch processing through
scripting.
Gatekeeping
is the
process through which information is filtered for
dissemination, whether for publication, broadcasting, the Internet, or
some other mode of
communication.
Gatekeeping in communication is the process through which
information is filtered for dissemination,
whether for publication, broadcasting, the Internet, or some other mode of
communication. The academic theory of gatekeeping is founded in multiple
fields of study, including communication studies, journalism, political
science, and sociology. It was originally focused on the mass media with
its few-to-many dynamic but now gatekeeping theory also addresses
face-to-face communication and the many-to-many dynamic inherent in the
Internet. The theory was first instituted by social psychologist Kurt
Lewin in 1943. Gatekeeping occurs at all levels of the media
structure—from a reporter deciding which sources are chosen to include in
a story to editors deciding which stories are printed or covered, and
includes media outlet owners and even advertisers.
Wisdom Keeper.
Logic Gate
- And, Or, Not. (algorithm filters) -
Questioning
Gatekeepers are individuals who decide whether a given message will be distributed by a mass medium. They serve in various roles including academic admissions, financial advising, and news editing.
Not to be confused with Mass Media.
Collaborative Filtering is the process of filtering for information or
patterns using techniques involving
collaboration among multiple agents,
viewpoints, data sources, etc. Sometimes making automatic
predictions
about the interests of a user by collecting preferences or taste
information from many users (collaborating).
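A minimal Python sketch of collaborative filtering in the narrow sense, with invented ratings: predict a user's interest in an item from the ratings of users with similar tastes.

# Invented illustration data: user -> {item: rating}.
ratings = {
    "ann": {"siteA": 5, "siteB": 1, "siteC": 4},
    "bob": {"siteA": 4, "siteB": 2},
    "cat": {"siteA": 1, "siteB": 5, "siteC": 2},
}

def similarity(u, v):
    # Inverse of the mean absolute rating difference on shared items.
    shared = ratings[u].keys() & ratings[v].keys()
    if not shared:
        return 0.0
    diff = sum(abs(ratings[u][i] - ratings[v][i]) for i in shared) / len(shared)
    return 1.0 / (1.0 + diff)

def predict(user, item):
    # Weighted average of other users' ratings, weighted by similarity.
    votes = [(similarity(user, v), r[item])
             for v, r in ratings.items() if v != user and item in r]
    total = sum(w for w, _ in votes)
    return sum(w * score for w, score in votes) / total if total else None

print(predict("bob", "siteC"))  # leans toward ann's 4, since bob rates like ann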
Process of Elimination is
a quick way of finding an answer to a
problem by excluding
low probability answers so that you can focus on the
most probable answers. With
multiple
choices you can remove choices that are known to be incorrect so that
your
chances of getting the correct answer
are greater. It is a logical method to identify an entity of interest
among several ones by excluding all other entities.
In
educational
testing, the process of elimination is a process of deleting options
whereby the possibility of an option being correct is close to zero or
significantly lower compared to other options. This version of the process
does not guarantee success, even if only 1 option remains since it
eliminates possibilities merely as improbable.
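A minimal Python sketch of the method, with an invented question and invented eliminations:

# Process of elimination on a multiple-choice question: discard options
# known to be wrong and see what remains.
options = {"A": "Paris", "B": "Rome", "C": "Madrid", "D": "Berlin"}
known_wrong = {"B", "C"}            # ruled out by prior knowledge

remaining = {k: v for k, v in options.items() if k not in known_wrong}
print(remaining)                    # chance of guessing right: 1/2, not 1/4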
Reason by Deduction -
Simplifying -
Data Conversion
Filter in signal processing is
a device or process that removes some unwanted components or
features from a signal. Filtering is a class of signal
processing, the defining feature of filters being the complete
or partial suppression of some aspect of the signal.
Noise.
Media Literacy -
Sensors -
Social Network
Blocking
Abstraction is the act of
withdrawing or
removing something. A general
concept
formed by extracting
common features from
specific examples. The process of
formulating general concepts by
abstracting common properties of instances. A concept or idea not
associated with any specific instance. Preoccupation with something to the
exclusion of all else.
Abstraction
is a conceptual process by which general rules and concepts are derived
from the usage and
classification of
specific examples. Conceptual
abstractions may be formed by filtering the
information content of a
concept or an observable phenomenon, selecting only the aspects which are
relevant for a particular
purpose.
Extracting is to deduce a principle or to construe a meaning. Extracting in chemistry is to purify or isolate using distillation, or to obtain and separate a substance, as by mechanical action. Extracting in mathematics is to calculate the root of a number.
Extraction (information).
Extract, Transform, Load is the general procedure of copying data from
one or more sources into a destination system which represents the data
differently from the source(s) or in a different context than the
source(s).
Data Migration is the process of selecting, preparing, extracting, and
transforming data
and permanently transferring it from one computer storage system to
another.
Terminology
Extraction is a subtask of information extraction. The goal of
terminology extraction is to automatically extract relevant terms from a
given corpus. Collect a vocabulary of domain-relevant terms, constituting
the linguistic surface manifestation of domain concepts.
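A minimal Python sketch of term extraction by frequency, with a made-up corpus and stopword list; real systems add linguistic filters on top of raw counts like these.

from collections import Counter

corpus = ("the semantic web links data the semantic web shares data "
          "ontology learning builds the semantic web")
stopwords = {"the", "a", "of", "and"}

# Count candidate terms, skipping ordinary stopwords.
counts = Counter(w for w in corpus.split() if w not in stopwords)
terms = [w for w, n in counts.most_common() if n >= 2]
print(terms)  # e.g. ['semantic', 'web', 'data']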
Data Scraping is a technique in which a computer program extracts data
from human-readable output coming from another program.
Information Extraction.
Web Scraping is
data scraping used for extracting data from websites sometimes using a
web crawler.
Screen Scraping is the process of
collecting screen display data from one application and translating it so
that another application can display it. This is normally done to capture
data from a legacy application in order to display it using a more modern
user interface.
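A minimal web-scraping sketch using only Python's standard library: pull every hyperlink out of a page's HTML. The page is an inline string here; urllib.request.urlopen could fetch a live (and permitted) URL instead.

from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    # Collect the href of every anchor tag encountered in the page.
    def __init__(self):
        super().__init__()
        self.links = []
    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.links += [v for k, v in attrs if k == "href"]

page = '<html><body><a href="https://example.org">Example</a></body></html>'
parser = LinkExtractor()
parser.feed(page)
print(parser.links)   # ['https://example.org']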
Data
Editing is defined as the process involving the review and adjustment
of collected survey data. The purpose is to control the quality of the
collected data. Data editing can be performed manually, with the
assistance of a computer or a combination of both.
Data Wrangling is the process of transforming and mapping data from
one "raw" data form into another format with the intent of making it more
appropriate and valuable for a variety of downstream purposes such as
analytics. A data wrangler is a person who performs these transformation
operations. This may include further munging, data visualization, data
aggregation, training a statistical model, as well as many other potential
uses. Data munging as a process typically follows a set of general steps
which begin with extracting the data in a raw form from the data source, "munging"
the raw data using algorithms (e.g. sorting) or parsing the data into
predefined data structures, and finally depositing the resulting content
into a data sink for storage and future use.
Noisy Text Analytics is a process of information extraction
whose goal is to automatically extract
structured or semistructured information from
noisy unstructured text data.
Fragmented -
Deconstructed
Noisy Text is text containing noise, which can be seen as all the differences between the surface form of a coded representation of the text and the intended, correct, or original text.
Deep Packet Inspection is
a form of computer network packet filtering that examines the data part
(and possibly also the header) of a packet as it passes an inspection
point, searching for protocol non-compliance, viruses, spam, intrusions,
or defined criteria to decide whether the packet may pass or if it needs
to be routed to a different destination, or, for the purpose of collecting
statistical information that functions at the Application layer of the OSI
(Open Systems Interconnection model).
Focus -
Attention
-
Multi-Tasking
Filtering information is not bad, as long as you are filtering correctly and focused on a particular goal so that only relevant information needs to be analyzed. The big problem is that people block relevant information and then naively call it filtering, which it is not. Most people are not aware that they are blocking relevant information, nor are they aware that they have biases against certain information. So the main difficulty is that people don't have enough knowledge and information to filter information without blocking relevant or important information. The
process of
information extraction needs to be learned and then practiced,
and also
verified in order to
make sure that the process is effective and efficient. Eventually
misinformation would be
totally eliminated from the media because it could never get through the
millions of people who are knowledgeable enough to quickly identify
false information and then remove
it and also stop the source from transmitting. Millions of filters
will be working together to keep information accurate, that is the future.
Filtering is a normal human process, it's just when you filter things,
what things you are filtering, and why you filter things that makes all
the difference.
Watch Dogs.
The Information Age
We are now living in the
Information Age.
A time where
information and
knowledge is so abundant that we can no
longer ignore it. But sadly, not everyone understands what information is, or understands the potential of information, or has access to information.
The Information age is
the greatest transition of the human race, and of
our planet. The power of knowledge is just beginning to be realized.
Knowledge and information gives us an incredible ability to explore
ourselves, and explore our world and our universe in ways that we have never imagined.
Knowledge and information can improve the lives of every man, woman and
child on this planet. Knowledge and information will also help us
understand the importance of all life forms on this planet like never
before. This is truly the
greatest
awakening of our world.
Preserving Information
-
Information Economy -
Knowledge Economy -
Knowledge Market
-
Knowledge Management
-
Information Literacy
-
Information Stations
-
Information Overload.
Knowledge Open to the Public
Libre Knowledge is knowledge released in such a way that users
are free to read, listen to, watch, or otherwise experience it; to learn
from or with it; to copy, adapt and use it for any purpose; and to share
the work (unchanged or modified).
Knowledge Commons refers to information, data, and content
that is collectively owned and managed by a community of users,
particularly over the Internet. What distinguishes a knowledge commons
from a commons of shared physical resources is that digital resources are
non-subtractible; that is, multiple users can
access the same digital
resources with no effect on their quantity or quality.
Open Science -
Open Source
Education -
Internet
Open Knowledge is knowledge that one is free to use,
reuse, and redistribute without legal, social or technological
restriction. Open knowledge is a set of principles and methodologies
related to the production and distribution of knowledge works in an open
manner. Knowledge is interpreted broadly to include data, content and
general information.
Open Knowledge Initiative is an organization responsible for the
specification of software interfaces comprising a Service Oriented
Architecture (SOA) based on high level service definitions.
Open Access Publishing refers to online research outputs that are free
of all restrictions on access (e.g. access tolls) and free of many
restrictions on use (e.g. certain copyright and license restrictions).
Open Data is the idea that some data should be
freely
available to everyone to use and republish as they wish, without
restrictions from copyright, patents or other mechanisms of control.
Open Content describes a creative work that others can copy
or modify.
A Human Search Engine is a lot of work.
I have been working an average of
20 Hours a week since 1998 and over 50 Hours a week since 2006.
With over a
Billion Websites containing
over 450 billion web pages on the
World Wide Web, there's a lot of information to be
organized. And with almost 2 billion people on the internet
there are a lot of minds to collaborate with.
My Human Search Engine
Design Methods are always improving, but I'm definitely not
a professional
Website Architect, so there is always more to learn. I'm constantly
Multitasking so I do make mistakes from time to time,
especially with proofreading my own writing, which seems almost
impossible (
Writers
Blindness). This is why writers and authors have proofreaders and copy editors, which is something I cannot afford
right now, so please excuse me for my spelling errors and poor
grammar. Besides that I'm still making progress and I'm always
acquiring new knowledge, which always makes these projects
fascinating and never boring.
The Adventures in Learning
You can also look at my website as web Indexing.
Web indexing
means creating indexes for individual Web sites, intranets,
collections of HTML documents, or even collections of Web sites.
Web-indexing.org
Indexes are systematically arranged items, such as topics
or names, that serve as entry points to go directly to desired
information within a larger document or set of documents.
Indexes are traditionally alphabetically arranged. But they may
also make use of
Hierarchical Arrangements, as provided by thesauri, or they
may be entirely hierarchical, as in the case of taxonomies. An
index might not even be displayed, if it is incorporated into a
searchable database.
Indexing is an analytic process of
determining which concepts are worth indexing, what entry labels
to use, and how to arrange the entries. As such, Web indexing is
best done by individuals skilled in the craft of indexing,
either through formal training or through self-taught reading
and study.
An Index is a list of words or phrases ('headings') and
associated pointers ('locators') to where useful material relating to that
heading can be found in a document or collection of documents. Examples
are an index in the back matter of a book and an index that serves as a
library catalog.
A Web index is often a browsable list of
entries from which the user makes selections, but it may be non-displayed
and searched by the user typing into a search box. A site A-Z index is a
kind of Web index that resembles an alphabetical back-of-the-book style
index, where the index entries are hyperlinked directly to the appropriate
Web page or page section, rather than using page numbers.
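A minimal Python sketch of building such a site A-Z index, grouping hypothetical entries by first letter, each hyperlinked to its page section:

# Hypothetical index entries: heading -> hyperlink target.
entries = {
    "Archivist": "/information.html#archivist",
    "Bayesian Inference": "/search.html#bayes",
    "Aggregator": "/search.html#aggregator",
}

index = {}
for heading in sorted(entries):
    index.setdefault(heading[0].upper(), []).append((heading, entries[heading]))

for letter, items in sorted(index.items()):
    print(letter, items)   # A [...], B [...]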
Interwiki Links is a facility for creating links to the many
wikis on the World Wide Web. Users avoid pasting in entire URLs (as they
would for regular web pages) and instead use a shorthand similar to links
within the same wiki (intrawiki links).
I'm like an aisle in the internet library. Organizing data out of necessity while making it a
value to others at the same time. Eventually connecting to other
human search engines around the world to expand its reach and
capabilities.
I like to describe my website
as being kind of like a lateral
Blog
rather than the usual
Linear Blog because I update
multiple pages at once instead of just one. As of 2010 around
120,000 new weblogs are being created worldwide each day, but of
the 70 million weblogs that have been created only around 15.5
million are actually active. Though blogs and
User-Generated Content are useful to some extent I feel that
too much time and effort is wasted, especially if the
information and knowledge that is gained from a blog is not
organized and categorized in a way that readers can utilize and
access these archives like they would do with newspapers. This
way someone can build knowledge based evidence and facts to use
against corruption and incompetence. This
would probably take a
Central Location for all the
blogs to submit to. This way useful knowledge and information
is not lost in a sea of confusion. This is one of the reasons
why this website's information and links will continue to be
organized and updated so that the website continues to improve.
"
Links
in a Chain"
"There's a lot you don't know, welcome to web 3.0" This
is not just my version of the internet, this is my vision of the
internet. And this is not philosophy, it's just the best idea that I have so far until I can
find something better to add to it, or replace it, or change it.
A Think Tank whose only major influence is
Logic.
"When an old man dies, it's like entire library burning down to the ground. But not for
me, I'll just back it up on the internet."
Search Engines
Organic Search Engine is a search engine that uses human
participation to filter the
search results and assist users in clarifying
their search request. The goal is to provide users with a limited number
of relevant results, as opposed to traditional search engines that often
return a large number of results that may or may not be relevant.
Organic
Search is a method for entering one or a plurality of search
items in a single data string into a search engine. Organic search results
are listings on search engine results pages that appear because of their
relevance to the search terms, as opposed to their being advertisements.
In contrast, non-organic search results may include pay per click advertising.
Research -
Search Engine Technology
Search is to try to
locate or
discover
something, or try to establish the existence of something. The activity of
looking thoroughly in order to find
something or someone. An
investigation seeking
answers. The
examination
of alternative hypotheses.
Hybrid Search Engine is a type of computer search engine
that uses different types of data with or without ontologies to produce
the algorithmically generated results based on web crawling. Previous
types of search engines only use text to generate their results. Hybrid
search engines use a combination of both crawler-based results and
directory results. More and more search engines these days are moving to a
hybrid-based model.
Plant Trees while you
Search the Web. Ecosia search engine has
helped plant almost 18
million trees.
Question and
Answers Format
Search Engine in computing is an information retrieval system designed
to help find information stored on a computer system. The search results
are usually presented in a list and are commonly called hits. Search
engines help to minimize the time required to find information and the
amount of information which must be consulted, akin to other techniques
for managing information overload. The most public, visible form of a
search engine is a Web search engine which searches for information on the
World Wide Web.
Indirection is the ability to
Reference something using
a name, reference, or container instead of the value itself. The most
common form of indirection is the act of manipulating a value through its
memory address. For example, accessing a variable through the use of a
pointer. A stored pointer that exists to provide a reference to an object
by double indirection is called an indirection node. In some older
computer architectures, indirect words supported a variety of more-or-less
complicated addressing modes.
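A minimal Python sketch of indirection, holding a reference instead of the value itself; the settings table and pointer are invented examples:

# The pointer names where the value lives; it is not the value.
settings = {"homepage": "https://example.org"}
pointer = ("settings", "homepage")

def dereference(ptr, namespace):
    # Follow the reference to reach the actual value.
    container, key = ptr
    return namespace[container][key]

print(dereference(pointer, {"settings": settings}))  # the actual value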
Probabilistic Relevance Model is a formalism of information retrieval
useful to derive ranking functions used by search engines and web search
engines in order to rank matching documents according to their relevance
to a given search query. It makes an estimation of the probability of
finding if a document dj is relevant to a query q. This model assumes that
this probability of relevance depends on the query and document
representations. Furthermore, it assumes that there is a portion of all
documents that is preferred by the user as the answer set for query q.
Such an ideal answer set is called R and should maximize the overall
probability of relevance to that user. The prediction is that documents in
this set R are relevant to the query, while documents not present in the
set are non-relevant.
Bayesian Network
is a probabilistic graphical model (a type of statistical model) that
represents a set of random
variables
and their conditional dependencies via a directed acyclic graph (DAG). For
example, a Bayesian network could represent the probabilistic
relationships between diseases and symptoms. Given symptoms, the network
can be used to compute the probabilities of the presence of various
diseases.
Bayesian
Inference is a method of statistical inference in which Bayes' theorem
is used to update the
probability for a hypothesis as more evidence or
information becomes available. Bayesian inference is an important
technique in statistics, and especially in mathematical statistics.
Bayesian updating is particularly important in the dynamic analysis of a
sequence of data. Bayesian inference has found application in a wide range
of activities, including science, engineering, philosophy, medicine,
sport, and law. In the philosophy of decision theory, Bayesian inference
is closely related to subjective probability, often called "Bayesian
probability".
Search Aggregator
is a type of metasearch engine which gathers results from
multiple search
engines simultaneously, typically through RSS search results. It combines
user specified search feeds (parameterized RSS feeds which return search
results) to give the user the same level of control over
content as a
general
aggregator, or a person who
collects things.
Metasearch Engine (or aggregator) is a search tool that uses other search engines' data to produce its own results from the Internet. Metasearch engines take input from a user
and simultaneously send out queries to third party search engines for
results. Sufficient data is gathered, formatted by their ranks and
presented to the users.
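A minimal metasearch sketch in Python: the same query's ranked results from two hypothetical engines are blended with reciprocal-rank fusion, a common simple merging rule (not necessarily the one any particular metasearch engine uses).

# Canned ranked lists standing in for two engines' responses.
engine_results = {
    "engineA": ["siteX", "siteY", "siteZ"],
    "engineB": ["siteY", "siteX", "siteW"],
}

scores = {}
for results in engine_results.values():
    for rank, url in enumerate(results, start=1):
        # Reciprocal-rank fusion: higher ranks contribute more.
        scores[url] = scores.get(url, 0.0) + 1.0 / (60 + rank)

merged = sorted(scores, key=scores.get, reverse=True)
print(merged)   # siteX and siteY rise to the top, agreed on by both engines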
Prospective Search
is a method of searching on the Internet where the query is given first
and the information for the results are then acquired. This differs from
traditional, or "retrospective", search such as search engines, where the
information for the results is acquired and then queried.
Multitask
Subject Indexing
is the act of describing or classifying a document by index terms or other
symbols in order to indicate what the document is about, to summarize its
content or to increase its findability. In other words, it is about
identifying and describing the subject of documents. Indexes are
constructed, separately, on three distinct levels: terms in a document
such as a book; objects in a collection such as a library; and documents
(such as books and articles) within a field of knowledge.
Search Engine Indexing collects, parses, and stores data to facilitate
fast and accurate information retrieval. Index design incorporates
interdisciplinary concepts from linguistics, cognitive psychology,
mathematics, informatics, and computer science. An alternate name for the
process in the context of search engines designed to find web pages on the
Internet is web indexing.
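A minimal Python sketch of the core data structure, an inverted index mapping each term to the documents that contain it, so a query becomes a fast lookup instead of a rescan of every page; the documents are invented.

# Parse documents once into an inverted index: term -> set of documents.
docs = {
    "page1": "human search engine organizes knowledge",
    "page2": "search engines use algorithms",
    "page3": "human knowledge is organized",
}

index = {}
for doc_id, text in docs.items():
    for term in set(text.lower().split()):
        index.setdefault(term, set()).add(doc_id)

def search(*terms):
    # AND query: documents containing every term.
    sets = [index.get(t, set()) for t in terms]
    return set.intersection(*sets) if sets else set()

print(search("human", "knowledge"))   # {'page1', 'page3'}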
Indexing -
Amazing Numbers and Facts
Text
Mining also referred to as text data mining, roughly equivalent to
text analytics, refers to the process of deriving high-quality information
from text. High-quality information is typically derived through the
devising of patterns and trends through means such as statistical pattern
learning. Text mining usually involves the process of structuring the
input text (usually parsing, along with the addition of some derived
linguistic features and the removal of others, and subsequent insertion
into a database), deriving
patterns
within the structured data, and finally evaluation and interpretation of
the output. 'High quality' in text mining usually refers to some
combination of relevance, novelty, and interestingness. Typical text
mining tasks include text categorization, text clustering, concept/entity
extraction, production of granular taxonomies, sentiment analysis,
document summarization, and entity relation modeling (i.e., learning
relations between named entities). Text analysis involves information
retrieval, lexical analysis to study word frequency distributions, pattern
recognition, tagging/annotation, information extraction, data mining
techniques including link and association analysis, visualization, and
predictive analytics. The overarching goal is, essentially, to turn text
into data for analysis, via application of natural language processing
(NLP) and analytical methods. A typical application is to scan a set of
documents written in a natural language and either model the document set
for predictive classification purposes or populate a database or search
index with the information extracted.
Social Search is a behavior of retrieving and searching on a
social searching engine
that mainly searches user-generated content such as news, videos and
images related search queries on social media like Facebook, Twitter,
Instagram and Flickr. It is an enhanced version of web search that
combines traditional algorithms. The idea behind social search is that
instead of a machine deciding which pages should be returned for a
specific query based upon an
impersonal algorithm, results that are based on the human network of
the searcher might be more relevant to that specific user's needs.
Interactive Person to Person Search Engine
Gimmeyit search engine is a crowd-source-based search engine using
social media content to find relevant search results rather than the
traditional rank-based search engines that rely on routine cataloging and
indexing of website data. The crowd-source approach scans social media
sources in real-rime to find results based on current social "buzz" rather
than proprietary ranking algorithms being run against indexed sites. With
a crowd source approach, no websites are indexed and no storage of website
metadata is maintained.
Tagasauris -
Public Data
Selection-Based Search
is a search engine system in which the user invokes a search query using
only the mouse. A selection-based search system allows the user to search
the internet for more information about any keyword or phrase contained
within a document or webpage in any software application on his desktop
computer using the mouse.
Web Searching Tips
Rummage is to
search haphazardly through a jumble of things. Ransacking.
Web Portal is most often a specially designed web site that
brings information together from diverse sources in a uniform way.
Usually, each information source gets its dedicated area on the page for
displaying information (a portlet); often, the user can configure which
ones to display.
Networks -
Social Networks
Router
is a networking device that forwards data packets between computer
networks. Routers perform the traffic directing functions on the Internet.
A data packet is typically forwarded from one router to another through
the networks that constitute the internetwork until it reaches its
destination node.
Interface -
Computer -
Internet
-
Web of Life
Window to the World -
Open Source
A Human Search Engine also includes:
Archival Science
-
Archive -
Knowledge Management -
Library Science -
Information Science.
Reflective Practice
-
Research -
Science
Tracking -
Interdiscipline -
Thesaurus
Human-Based Computation
is a computer science technique in which
a
machine performs its function by outsourcing certain steps to humans,
usually as microwork. This approach uses differences in abilities and
alternative costs between humans and computer agents to achieve
symbiotic human-computer
interaction. In traditional computation, a human employs a computer to
solve a problem; a human provides a formalized problem description and an
algorithm
to a computer, and receives a solution to interpret. Human-based
computation frequently reverses the roles; the computer asks a person or a
large group of people to solve a problem, then collects, interprets, and
integrates their solutions.
Internet Searching Tips
"Knowing how to ask a question and knowing how to analyze the answers"
If you're on a website using the Firefox browser, and you right-click on the page and then click on "Save Page As", it will save the entire page on your computer so that you can view that page when you are offline, without the need of an internet connection.
When searching the
internet you have to use more than one
search engine in order
to do a complete search. Using one search engine will narrow
your findings and possibly keep you from finding what you're
looking for because most search engines are not perfect and are
sometimes unorganized, flawed and manipulated. This is why I'm
organizing the Internet because search engines are flawed and
thus cannot be fully depended on for accuracy.
Adaptive
Search
Example:
Using the same exact keywords on 4 different
search engines I found the website that I was looking for at the
top in the number one position, on 2 of the 4 search engines,
and I could not find that same website on the other search engines unless I searched several pages deep. So one search
engine is flawed or manipulated and the other search engine is
not. There is a chance that the webpage you are looking for is
not titled correctly so you may have to use different keywords
or phrases in order to find it. But even then this is no
guarantee because search engines also use other factors when
calculating the results for particular words or phrases. And
what all those other factors are and how they work is not
exactly clear.
Search engines are in fact a highly important
Social Service, just like a
Congressman or
President, except not corrupted of course. If you honestly cannot say exactly how and why you performed a
particular action, then how the hell are people supposed to
believe you or understand what they need to do in order to fix
your mistake or at least confirm there was no mistake?
Transparency,
Truth and knowing the
Facts for these particular services are absolutely
necessary. People have the right not to be part of a
Blind
Experiment. These Systems need to be
Open,
Monitored and
Audited in order for them to work accurately and efficiently.
When searching the Internet,
sometimes going
several pages deep on search engines will also help find
information, this is because the first 10 choices are sometimes
irrelevant. I have sometimes found things that I'm looking for
30 pages deep. You will also find different key words, phrases
and characters within the search results that may also help
increase your odds of finding what you're looking for. Sometimes
checking a website's links on its resources page may also help
you find websites that are not listed correctly in search
engines. Web Searching for Information needs to be a
science.
Human Search Engine Tips
Most search engines like Google have
Advanced Searching Tools found on the side or at the bottom
of their search pages.
Knowing where to
type in certain characters in your search phrases also helps you
find what you're looking for.
If you want to limit your searches on
Google to only education websites or government websites
then type in "site:edu" or "site:gov" after your key word or
search phrase.
For example: Teaching Mathematical Concepts site:edu
For searching a specific website, add the site to your search term, for example "neutrino site:harvard.edu".
To narrow your searches to file types like PowerPoint, Excel or PDF, type in filetype:ppt after the word.
For search ranges use 2 periods between 2 numbers, like "Wii
$200
..$300."
Using quotes or a
+ or
- within your search
phrases. Example, imagine you want to find pages that have
references to both President Obama and President Bush on the
same page.
You could search this way: +President Obama +President Bush
Or if you want to find pages that have just President Obama and
not President Bush then your search would be
President Obama -President Bush.
If you are looking for sand
sharks search engines will give you results with the word sand
and sharks, but if you use quotation marks around "sand sharks" it will help narrow your search.
Using "~" (tilde)
before a search term yields results with related terms.
Regular Expression is a sequence of characters that
define a search pattern. Usually this
pattern
is used by string searching algorithms for "find" or "find and replace"
operations on strings, or for input validation. It is a technique that
developed in theoretical computer science and formal language theory.
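A minimal Python sketch of one regular expression doing both "find" and "find and replace", here matching the "$200..$300" style price range from the tips above:

import re

# One pattern: a dollar amount, two dots, another dollar amount.
pattern = re.compile(r"\$(\d+)\s*\.\.\s*\$(\d+)")

text = "Wii $200 ..$300 listings"
match = pattern.search(text)
if match:
    low, high = match.groups()
    print("range:", low, "to", high)          # find
print(pattern.sub("[price range]", text))     # find and replace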
Conversions: try typing "50 miles in kilometers" or "100 dollars in Canadian dollars".
Use Google to do math: just enter a calculation as you would into your computer's calculator (i.e. * corresponds to multiply, / to divide, etc.).
To find the time in a certain place, type in Time: Danbury, CT.
Just got a phone call and want to see where the call is from? Then type in the 3-digit area code.
Type any address into Google's main search bar for maps and
directions.
While on
Google Maps select the day of the week and the time of day
for the traffic forecast.
What are people Searching for and what Key words are they using
Search Query Trends -
Google Insights Search Trends -
Google
Trends -
Google
-
Yahoo Alexa Web Trends
You can learn even more great search tips by visiting this website
Search Engine Watch.
Learning
Boolean Logic can also help with improving your Internet searching skills.
Boolean Operators (youtube)
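A minimal Python sketch of that Boolean search logic, applying AND, OR, and NOT to a few invented pages, echoing the President Obama / President Bush example above:

# Invented pages standing in for a search engine's document set.
pages = {
    "p1": "president obama speech",
    "p2": "president bush speech",
    "p3": "obama and bush debate",
}

def having(term):
    # The set of pages containing a term.
    return {p for p, text in pages.items() if term in text}

print(having("obama") & having("bush"))    # AND -> {'p3'}
print(having("obama") | having("bush"))    # OR  -> all three
print(having("obama") - having("bush"))    # NOT -> {'p1'}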
Algorithm Censorship
Google
censors search
results, while at the same time they killed thousands of small businesses, and
not only that, they influenced other people to censor information
and corrupt the system even more. Why do corporations get greedy and
criminal? And why do they cause others to repeat this
madness?
Money and
Power
are a cancer in the wrong hands. So the
Dragonfly censored secret search engine is nothing new. And on top of
that, people are getting bombarded with
robocalls because of Google, and people have to visit websites littered with ads because of Google, which is abusive and one of the reasons why Google has been sued several times for millions of dollars. And the
lawsuits are not stopping these abuses.
Problems with Google -
Information Bubbles
Life through
Google's Eyes, Google's instant autocomplete that automatically fills
in words and phrases with search predictions and suggestions. Sometimes
with disturbing results.
Google
Algorithm works OK most of the time, but it is also used to
censor websites unfairly.
Corruption at its worst.
Penguin (wiki) -
EMD
(wiki)
Panda (wiki) -
Google Bomb (wiki)
Criticism of Google (wiki)
Google Fined $1.7 Billion by EU for Blocking Advertising Rivals.
Alphabet's Google was fined $1.7 billion by the European Union for
limiting how some websites could display ads sold by its rivals.
Search Engine Failures
Algorithms
-
Search Algorithm -
Human Search Engine
Internet -
Internet Safety
"If you are indexing information, that should be your focus.
If information is judged on irrelevant factors, then you will
fail to correctly distribute information, which will make
certain information in search results unreliable, illogical and corrupted."