Calls findMotifsGenome.pl with the -find option to search for instances of specified motifs across a given set of regions, and saves the results to a specified file. This function does not perform motif discovery; instead, it uses a motif file to locate and return the exact regions in which the motifs occur. Motifs are provided as matrices, as generated by find_motifs_genome and read in by read_*_results.

find_motifs_instances(x, path, genome, motif_file, scan_size = "given",
  cores = parallel::detectCores(), cache = .calc_free_mem()/4)
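
To make the mapping from arguments to HOMER options concrete, the following is a minimal sketch of how such a command line could be assembled. The helper build_find_cmd, its out_dir argument, and the file names are hypothetical and not part of the package; the command the wrapper actually runs may differ in detail. Only the HOMER options -find, -size, -p, -cache, and -chopify are taken from HOMER itself.

```r
# Hypothetical helper (not part of the package): build a findMotifsGenome.pl
# command string from arguments that mirror find_motifs_instances().
build_find_cmd <- function(bed, genome, motif_file, path,
                           scan_size = "given", cores = 2, cache = 1000,
                           out_dir = tempdir()) {
  args <- c(
    "findMotifsGenome.pl", shQuote(bed), genome, shQuote(out_dir),
    "-find", shQuote(motif_file),
    "-size", as.character(scan_size),
    "-p", as.character(as.integer(cores)),
    "-cache", format(round(cache), scientific = FALSE)
  )
  # Per the scan_size description, "given" is paired with -chopify.
  if (identical(scan_size, "given")) args <- c(args, "-chopify")
  # With -find, HOMER prints motif instances to stdout, so redirect to `path`.
  paste(paste(args, collapse = " "), ">", shQuote(path))
}

# File names below are placeholders for illustration only.
cmd <- build_find_cmd("regions.bed", "mm10", "motifs.motif", "instances.txt")
cat(cmd, "\n")
```

The sketch only builds and prints a string; it does not invoke HOMER.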

Arguments

x

data.frame whose first three columns are the chromosome, start, and end coordinates, with a fourth column giving the region identifier; extra columns may be kept; x may alternatively be a path to an existing bed file
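
As an illustration of the expected layout, here is a small sketch that builds a suitable data.frame and, alternatively, writes it out as a BED file. The coordinates, identifiers, and the extra score column are made up for the example, and the column names are arbitrary since only column order is described above.

```r
# Made-up example regions: chromosome, start, end, identifier, plus one
# extra column. Integer coordinates avoid scientific notation on output.
regions <- data.frame(
  chr   = c("chr1", "chr1", "chr2"),
  start = c(1000L, 5000L, 20000L),
  end   = c(1200L, 5300L, 20150L),
  id    = c("peak_1", "peak_2", "peak_3"),
  score = c(10.5, 8.2, 3.1),
  stringsAsFactors = FALSE
)

# Basic sanity checks before handing the regions to HOMER.
stopifnot(
  ncol(regions) >= 4,
  all(regions$start < regions$end),
  anyDuplicated(regions$id) == 0
)

# Alternative input: write the first four columns as a tab-separated BED
# file without a header, and pass its path as `x` instead.
bed_path <- tempfile(fileext = ".bed")
write.table(regions[, 1:4], bed_path, sep = "\t",
            quote = FALSE, row.names = FALSE, col.names = FALSE)
```

Either regions or bed_path could then be supplied as x.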

path

path of file to save motif instances results to

genome

ID of installed genome; check installed genomes using list_homer_packages(); examples include "hg38" and "mm10"; add an 'r' at the end to mask repeats, e.g. "mm10r"

motif_file

path to file containing the motifs to be scanned for; can be written by write_homer_motif

scan_size

size of sequence to scan; this can be a number specifying how many bases to scan, centered on the region, or alternatively can be set to "given" to scan the entire region; if using "given", the "-chopify" option is used to cut large background sequences to the average target sequence size [default: given]
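
To show what a numeric scan_size means in practice, the sketch below approximates the window scanned when the size is centered on each region's midpoint. The helper scan_window is hypothetical and only illustrates the idea; HOMER's exact rounding and coordinate conventions are an assumption here and may differ by a base.

```r
# Hypothetical helper (not part of the package): approximate the window
# scanned for a numeric scan_size, centered on each region's midpoint.
scan_window <- function(start, end, scan_size) {
  center <- floor((start + end) / 2)
  half <- floor(scan_size / 2)
  data.frame(start = center - half, end = center + half)
}

# Two made-up regions of different widths, both scanned over 200 bases.
windows <- scan_window(start = c(1000, 5000), end = c(1200, 5300),
                       scan_size = 200)
print(windows)
```

With scan_size = "given" no recentering takes place and each region is scanned over its full extent.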

cores

number of cores to use [default: all cores available]

cache

number in MB to use as cache to store sequences in memory [default: calculates free memory and divides by 4]

Value

Nothing; called for its side-effect of producing HOMER results

See also

read_known_results, read_denovo_results